2101.00288
Cited By
Polyjuice: Generating Counterfactuals for Explaining, Evaluating, and Improving Models
1 January 2021
Tongshuang Wu
Marco Tulio Ribeiro
Jeffrey Heer
Daniel S. Weld
Papers citing
"Polyjuice: Generating Counterfactuals for Explaining, Evaluating, and Improving Models"
50 / 180 papers shown
Clarify: Improving Model Robustness With Natural Language Corrections
Yoonho Lee
Michelle S. Lam
Helena Vasconcelos
Michael S. Bernstein
Chelsea Finn
40
6
0
06 Feb 2024
LLMCheckup: Conversational Examination of Large Language Models via Interpretability Tools and Self-Explanations
Qianli Wang
Tatiana Anikina
Nils Feldhus
Josef van Genabith
Leonhard Hennig
Sebastian Möller
ELM
LRM
25
8
0
23 Jan 2024
Towards a Non-Ideal Methodological Framework for Responsible ML
Ramaravind Kommiya Mothilal
Shion Guha
Syed Ishtiaque Ahmed
59
7
0
20 Jan 2024
An Empirical Study of Counterfactual Visualization to Support Visual Causal Inference
Arran Zeyu Wang
D. Borland
David Gotz
CML
46
11
0
16 Jan 2024
Are self-explanations from Large Language Models faithful?
Andreas Madsen
Sarath Chandar
Siva Reddy
LRM
35
25
0
15 Jan 2024
Sparsity-Guided Holistic Explanation for LLMs with Interpretable Inference-Time Intervention
Zhen Tan
Tianlong Chen
Zhenyu Zhang
Huan Liu
52
14
0
22 Dec 2023
InstructPipe: Generating Visual Blocks Pipelines with Human Instructions and LLMs
Zhongyi Zhou
Jing Jin
Vrushank Phadnis
Xiuxiu Yuan
Jun Jiang
...
A. Olwal
David Kim
Ram Iyengar
Na Li
Andrea Colaço
38
0
0
15 Dec 2023
Using Captum to Explain Generative Language Models
Vivek Miglani
Aobo Yang
Aram H. Markosyan
Diego Garcia-Olano
Narine Kokhlikyan
42
27
0
09 Dec 2023
TIBET: Identifying and Evaluating Biases in Text-to-Image Generative Models
Aditya Chinchure
Pushkar Shukla
Gaurav Bhatt
Kiri Salij
K. Hosanagar
Leonid Sigal
Matthew A. Turk
26
24
0
03 Dec 2023
SocialCounterfactuals: Probing and Mitigating Intersectional Social Biases in Vision-Language Models with Counterfactual Examples
Phillip Howard
Avinash Madasu
Tiep Le
Gustavo Lujan Moreno
Anahita Bhiwandiwalla
Vasudev Lal
57
16
0
30 Nov 2023
Attribution and Alignment: Effects of Local Context Repetition on Utterance Production and Comprehension in Dialogue
Aron Molnar
Jaap Jumelet
Mario Giulianelli
Arabella J. Sinclair
40
2
0
21 Nov 2023
Measuring and Improving Attentiveness to Partial Inputs with Counterfactuals
Yanai Elazar
Bhargavi Paranjape
Hao Peng
Sarah Wiegreffe
Khyathi Raghavi
Vivek Srikumar
Sameer Singh
Noah A. Smith
AAML
OOD
38
0
0
16 Nov 2023
Using Natural Language Explanations to Improve Robustness of In-context Learning
Xuanli He
Yuxiang Wu
Oana-Maria Camburu
Pasquale Minervini
Pontus Stenetorp
AAML
41
1
0
13 Nov 2023
Interpreting Pretrained Language Models via Concept Bottlenecks
Zhen Tan
Lu Cheng
Song Wang
Yuan Bo
Wenlin Yao
Huan Liu
LRM
47
20
0
08 Nov 2023
Quantifying Uncertainty in Natural Language Explanations of Large Language Models
Sree Harsha Tanneru
Chirag Agarwal
Himabindu Lakkaraju
LRM
32
14
0
06 Nov 2023
"Honey, Tell Me What's Wrong", Global Explanation of Textual Discriminative Models through Cooperative Generation
Antoine Chaffin
Julien Delaunay
18
0
0
27 Oct 2023
Break it, Imitate it, Fix it: Robustness by Generating Human-Like Attacks
Aradhana Sinha
Ananth Balashankar
Ahmad Beirami
Thi Avrahami
Jilin Chen
Alex Beutel
AAML
27
4
0
25 Oct 2023
Sum-of-Parts: Self-Attributing Neural Networks with End-to-End Learning of Feature Groups
Weiqiu You
Helen Qu
Marco Gatti
Bhuvnesh Jain
Eric Wong
FAtt
FaML
58
4
0
25 Oct 2023
Towards Conceptualization of "Fair Explanation": Disparate Impacts of anti-Asian Hate Speech Explanations on Content Moderators
Tin Nguyen
Jiannan Xu
Aayushi Roy
Hal Daumé
Marine Carpuat
40
5
0
23 Oct 2023
EXPLAIN, EDIT, GENERATE: Rationale-Sensitive Counterfactual Data Augmentation for Multi-hop Fact Verification
Yingjie Zhu
Jiasheng Si
Yibo Zhao
Haiyang Zhu
Deyu Zhou
Yulan He
46
6
0
23 Oct 2023
Faithfulness Measurable Masked Language Models
Andreas Madsen
Siva Reddy
Sarath Chandar
46
3
0
11 Oct 2023
InterroLang: Exploring NLP Models and Datasets through Dialogue-based Explanations
Nils Feldhus
Qianli Wang
Tatiana Anikina
Sahil Chopra
Cennet Oguz
Sebastian Möller
45
11
0
09 Oct 2023
Model Compression in Practice: Lessons Learned from Practitioners Creating On-device Machine Learning Experiences
Fred Hohman
Mary Beth Kery
Donghao Ren
Dominik Moritz
37
16
0
06 Oct 2023
From Language Modeling to Instruction Following: Understanding the Behavior Shift in LLMs after Instruction Tuning
Xuansheng Wu
Wenlin Yao
Jianshu Chen
Xiaoman Pan
Xiaoyang Wang
Ninghao Liu
Dong Yu
LRM
28
28
0
30 Sep 2023
EvalLM: Interactive Evaluation of Large Language Model Prompts on User-Defined Criteria
Tae Soo Kim
Yoonjoo Lee
Jamin Shin
Young-Ho Kim
Juho Kim
39
69
0
24 Sep 2023
Towards LLM-guided Causal Explainability for Black-box Text Classifiers
Amrita Bhattacharjee
Raha Moraffah
Joshua Garland
Huan Liu
34
35
0
23 Sep 2023
COCO-Counterfactuals: Automatically Constructed Counterfactual Examples for Image-Text Pairs
Tiep Le
Vasudev Lal
Phillip Howard
DiffM
39
21
0
23 Sep 2023
CATfOOD: Counterfactual Augmented Training for Improving Out-of-Domain Performance and Calibration
Rachneet Sachdeva
Martin Tutek
Iryna Gurevych
OODD
37
12
0
14 Sep 2023
Explainability for Large Language Models: A Survey
Haiyan Zhao
Hanjie Chen
Fan Yang
Ninghao Liu
Huiqi Deng
Hengyi Cai
Shuaiqiang Wang
Dawei Yin
Mengnan Du
LRM
39
415
0
02 Sep 2023
PEANUT: A Human-AI Collaborative Tool for Annotating Audio-Visual Data
Zheng Zhang
Zheng Ning
Chenliang Xu
Yapeng Tian
Toby Jia-Jun Li
64
6
0
27 Jul 2023
CommonsenseVIS: Visualizing and Understanding Commonsense Reasoning Capabilities of Natural Language Models
Xingbo Wang
Renfei Huang
Zhihua Jin
Tianqing Fang
Huamin Qu
VLM
ReLM
LRM
48
1
0
23 Jul 2023
Do Models Explain Themselves? Counterfactual Simulatability of Natural Language Explanations
Yanda Chen
Ruiqi Zhong
Narutatsu Ri
Chen Zhao
He He
Jacob Steinhardt
Zhou Yu
Kathleen McKeown
LRM
36
47
0
17 Jul 2023
Power-up! What Can Generative Models Do for Human Computation Workflows?
Garrett Allen
Gaole He
U. Gadiraju
51
3
0
05 Jul 2023
Concept-Based Explanations to Test for False Causal Relationships Learned by Abusive Language Classifiers
I. Nejadgholi
S. Kiritchenko
Kathleen C. Fraser
Esma Balkir
28
0
0
04 Jul 2023
On Evaluating and Mitigating Gender Biases in Multilingual Settings
Aniket Vashishtha
Kabir Ahuja
Sunayana Sitaram
39
23
0
04 Jul 2023
Bring Your Own Data! Self-Supervised Evaluation for Large Language Models
Neel Jain
Khalid Saifullah
Yuxin Wen
John Kirchenbauer
Manli Shu
Aniruddha Saha
Micah Goldblum
Jonas Geiping
Tom Goldstein
ALM
ELM
38
23
0
23 Jun 2023
Towards Explainable Evaluation Metrics for Machine Translation
Christoph Leiter
Piyawat Lertvittayakumjorn
M. Fomicheva
Wei Zhao
Yang Gao
Steffen Eger
ELM
43
13
0
22 Jun 2023
Towards Regulatable AI Systems: Technical Gaps and Policy Opportunities
Xudong Shen
H. Brown
Jiashu Tao
Martin Strobel
Yao Tong
Akshay Narayan
Harold Soh
Finale Doshi-Velez
37
3
0
22 Jun 2023
Which Spurious Correlations Impact Reasoning in NLI Models? A Visual Interactive Diagnosis through Data-Constrained Counterfactuals
Robin Shing Moon Chan
Afra Amini
Mennatallah El-Assady
LRM
AAML
45
2
0
21 Jun 2023
Causal Effect Regularization: Automated Detection and Removal of Spurious Attributes
Abhinav Kumar
Amit Deshpande
Ajay Sharma
CML
26
1
0
19 Jun 2023
Cross-Modal Attribute Insertions for Assessing the Robustness of Vision-and-Language Learning
Shivaen Ramshetty
Gaurav Verma
Srijan Kumar
33
2
0
19 Jun 2023
Revisiting Out-of-distribution Robustness in NLP: Benchmark, Analysis, and LLMs Evaluations
Lifan Yuan
Yangyi Chen
Ganqu Cui
Hongcheng Gao
Fangyuan Zou
Xingyi Cheng
Heng Ji
Zhiyuan Liu
Maosong Sun
52
75
0
07 Jun 2023
Reason to explain: Interactive contrastive explanations (REASONX)
Laura State
Salvatore Ruggieri
Franco Turini
LRM
35
1
0
29 May 2023
Faithfulness Tests for Natural Language Explanations
Pepa Atanasova
Oana-Maria Camburu
Christina Lioma
Thomas Lukasiewicz
J. Simonsen
Isabelle Augenstein
FAtt
37
59
0
29 May 2023
CREST: A Joint Framework for Rationalization and Counterfactual Text Generation
Marcos Vinícius Treviso
Alexis Ross
Nuno M. Guerreiro
André F.T. Martins
36
16
0
26 May 2023
Counterfactuals of Counterfactuals: a back-translation-inspired approach to analyse counterfactual editors
Giorgos Filandrianos
Edmund Dervakos
Orfeas Menis Mastromichalakis
Chrysoula Zerva
Giorgos Stamou
AAML
37
5
0
26 May 2023
Controlling Learned Effects to Reduce Spurious Correlations in Text Classifiers
Parikshit Bansal
Amit Sharma
CML
28
5
0
26 May 2023
On Degrees of Freedom in Defining and Testing Natural Language Understanding
Saku Sugawara
S. Tsugita
ELM
39
1
0
24 May 2023
Clever Hans or Neural Theory of Mind? Stress Testing Social Reasoning in Large Language Models
Natalie Shapira
Mosh Levy
S. Alavi
Xuhui Zhou
Yejin Choi
Yoav Goldberg
Maarten Sap
Vered Shwartz
LLMAG
ELM
33
118
0
24 May 2023
Exploring Contrast Consistency of Open-Domain Question Answering Systems on Minimally Edited Questions
Zhihan Zhang
Wenhao Yu
Zheng Ning
Mingxuan Ju
Meng Jiang
31
4
0
23 May 2023