Contrastive Explanations for Model Interpretability

2 March 2021

Yejin Choi

Papers citing "Contrastive Explanations for Model Interpretability"

50 / 62 papers shown

Comparative Explanations: Explanation Guided Decision Making for Human-in-the-Loop Preference Selection Tanmay Chakraborty Christian Wirth Christin Seifert 28 0 0 01 Apr 2025
Conceptual Contrastive Edits in Textual and Vision-Language Retrieval Maria Lymperaiou Giorgos Stamou VLM 55 0 0 01 Mar 2025
Comparing zero-shot self-explanations with human rationales in text classification Stephanie Brandl Oliver Eberle 62 0 0 24 Feb 2025
Is Conversational XAI All You Need? Human-AI Decision Making With a Conversational XAI Assistant Gaole He Nilay Aishwarya U. Gadiraju 40 6 0 29 Jan 2025
A Comparative Analysis of Counterfactual Explanation Methods for Text Classifiers Stephen McAleese Mark Keane 33 0 0 04 Nov 2024
Contrastive Explanations That Anticipate Human Misconceptions Can Improve Human Decision-Making Skills Zana Buçinca S. Swaroop Amanda E. Paluch Finale Doshi-Velez Krzysztof Z. Gajos 48 2 0 05 Oct 2024
CELL your Model: Contrastive Explanations for Large Language Models Ronny Luss Erik Miehling Amit Dhurandhar 47 0 0 17 Jun 2024
Unveiling and Manipulating Prompt Influence in Large Language Models Zijian Feng Hanzhang Zhou Zixiao Zhu Junlang Qian Kezhi Mao 37 2 0 20 May 2024
ConPro: Learning Severity Representation for Medical Images using Contrastive Learning and Preference Optimization Hong Nguyen H. Nguyen Melinda Y. Chang Hieu H. Pham Shrikanth Narayanan Michael Pazzani 27 0 0 29 Apr 2024
Interactive Prompt Debugging with Sequence Salience Ian Tenney Ryan Mullins Bin Du Shree Pandya Minsuk Kahng Lucas Dixon LRM 32 1 0 11 Apr 2024
LLM Attributor: Interactive Visual Attribution for LLM Generation Seongmin Lee Zijie J. Wang Aishwarya Chakravarthy Alec Helbling Sheng-Hsuan Peng Mansi Phute Duen Horng Chau Minsuk Kahng 38 3 0 01 Apr 2024
Heterogeneous Contrastive Learning for Foundation Models and Beyond Lecheng Zheng Baoyu Jing Zihao Li Hanghang Tong Jingrui He VLM 38 19 0 30 Mar 2024
Visual Analytics for Fine-grained Text Classification Models and Datasets Munkhtulga Battogtokh Y. Xing Cosmin Davidescu Alfie Abdul-Rahman Michael Luck Rita Borgo 31 0 0 21 Mar 2024
RORA: Robust Free-Text Rationale Evaluation Zhengping Jiang Yining Lu Hanjie Chen Daniel Khashabi Benjamin Van Durme Anqi Liu 47 1 0 28 Feb 2024
Explaining Probabilistic Models with Distributional Values Luca Franceschi Michele Donini Cédric Archambeau Matthias Seeger FAtt 37 2 0 15 Feb 2024
Observable Propagation: Uncovering Feature Vectors in Transformers Jacob Dunefsky Arman Cohan 35 2 0 26 Dec 2023
Navigating the Structured What-If Spaces: Counterfactual Generation via Structured Diffusion Nishtha Madaan Srikanta J. Bedathur DiffM 38 0 0 21 Dec 2023
TextGenSHAP: Scalable Post-hoc Explanations in Text Generation with Long Documents James Enouen Hootan Nakhost Sayna Ebrahimi Sercan Ö. Arik Yan Liu Tomas Pfister 33 4 0 03 Dec 2023
XplainLLM: A QA Explanation Dataset for Understanding LLM Decision-Making Zichen Chen Jianda Chen Mitali Gaidhani Ambuj K. Singh Misha Sra 32 4 0 15 Nov 2023
DistillCSE: Distilled Contrastive Learning for Sentence Embeddings Jiahao Xu Wei Shao Lihui Chen Lemao Liu FedML 29 4 0 20 Oct 2023
Rather a Nurse than a Physician -- Contrastive Explanations under Investigation Oliver Eberle Ilias Chalkidis Laura Cabello Stephanie Brandl 24 9 0 18 Oct 2023
Disentangling the Linguistic Competence of Privacy-Preserving BERT Stefan Arnold Nils Kemmerzell Annika Schreiner 25 0 0 17 Oct 2023
LIPEx-Locally Interpretable Probabilistic Explanations-To Look Beyond The True Class Hongbo Zhu Angelo Cangelosi Procheta Sen Anirbit Mukherjee FAtt 42 0 0 07 Oct 2023
Explaining Speech Classification Models via Word-Level Audio Segments and Paralinguistic Features Eliana Pastor Alkis Koudounas Giuseppe Attanasio Dirk Hovy Elena Baralis 16 4 0 14 Sep 2023
A Geometric Notion of Causal Probing Clément Guerner Anej Svete Tianyu Liu Alex Warstadt Ryan Cotterell LLMSV 38 12 0 27 Jul 2023
Analyzing Chain-of-Thought Prompting in Large Language Models via Gradient-based Feature Attributions Skyler Wu Eric Meng Shen Charumathi Badrinath Jiaqi Ma Himabindu Lakkaraju LRM 38 26 0 25 Jul 2023
Do Models Explain Themselves? Counterfactual Simulatability of Natural Language Explanations Yanda Chen Ruiqi Zhong Narutatsu Ri Chen Zhao He He Jacob Steinhardt Zhou Yu Kathleen McKeown LRM 26 47 0 17 Jul 2023
CLIMAX: An exploration of Classifier-Based Contrastive Explanations Praharsh Nanavati Ranjitha Prasad 37 0 0 02 Jul 2023
Two-Stage Holistic and Contrastive Explanation of Image Classification Weiyan Xie Xiao-hui Li Zhi Lin Leonard K. M. Poon Caleb Chen Cao N. Zhang 24 2 0 10 Jun 2023
Empirical Sufficiency Lower Bounds for Language Modeling with Locally-Bootstrapped Semantic Structures Jakob Prange Emmanuele Chersoni 32 0 0 30 May 2023
Faithfulness Tests for Natural Language Explanations Pepa Atanasova Oana-Maria Camburu Christina Lioma Thomas Lukasiewicz J. Simonsen Isabelle Augenstein FAtt 29 59 0 29 May 2023
Learning to Generalize for Cross-domain QA Yingjie Niu Linyi Yang Ruihai Dong Yue Zhang 18 6 0 14 May 2023
Distinguish Before Answer: Generating Contrastive Explanation as Knowledge for Commonsense Question Answering Qianglong Chen Guohai Xu Mingshi Yan Ji Zhang Fei Huang Luo Si Yin Zhang 18 9 0 14 May 2023
Surfacing Biases in Large Language Models using Contrastive Input Decoding G. Yona Or Honovich Itay Laish Roee Aharoni 27 11 0 12 May 2023
Explaining Model Confidence Using Counterfactuals Thao Le Tim Miller Ronal Singh L. Sonenberg 19 2 0 10 Mar 2023
Signed Directed Graph Contrastive Learning with Laplacian Augmentation Taewook Ko Y. Choi Chong-Kwon Kim 26 3 0 12 Jan 2023
ROSCOE: A Suite of Metrics for Scoring Step-by-Step Reasoning O. Yu. Golovneva Moya Chen Spencer Poff Martin Corredor Luke Zettlemoyer Maryam Fazel-Zarandi Asli Celikyilmaz ReLM LRM 20 137 0 15 Dec 2022
CCPrefix: Counterfactual Contrastive Prefix-Tuning for Many-Class Classification Heng Chang Canran Xu Guodong Long Tao Shen Chongyang Tao Jing Jiang 38 1 0 11 Nov 2022
A General Search-based Framework for Generating Textual Counterfactual Explanations Daniel Gilo Shaul Markovitch LRM 34 0 0 01 Nov 2022
Does Self-Rationalization Improve Robustness to Spurious Correlations? Alexis Ross Matthew E. Peters Ana Marasović LRM 21 11 0 24 Oct 2022
Lexical Generalization Improves with Larger Models and Longer Training Elron Bandel Yoav Goldberg Yanai Elazar 49 6 0 23 Oct 2022
Log-linear Guardedness and its Implications Shauli Ravfogel Yoav Goldberg Ryan Cotterell 28 2 0 18 Oct 2022
Beyond Model Interpretability: On the Faithfulness and Adversarial Robustness of Contrastive Textual Explanations Julia El Zini M. Awad AAML 23 2 0 17 Oct 2022
Contrastive Corpus Attribution for Explaining Representations Christy Lin Hugh Chen Chanwoo Kim Su-In Lee SSL 19 8 0 30 Sep 2022
Towards Faithful Model Explanation in NLP: A Survey Qing Lyu Marianna Apidianaki Chris Callison-Burch XAI 109 107 0 22 Sep 2022
Policy Optimization with Sparse Global Contrastive Explanations Jiayu Yao S. Parbhoo Weiwei Pan Finale Doshi-Velez OffRL 14 1 0 13 Jul 2022
Probing Classifiers are Unreliable for Concept Removal and Detection Abhinav Kumar Chenhao Tan Amit Sharma AAML 28 20 0 08 Jul 2022
Improving Model Understanding and Trust with Counterfactual Explanations of Model Confidence Thao Le Tim Miller Ronal Singh L. Sonenberg 14 9 0 06 Jun 2022
Investigating the Benefits of Free-Form Rationales Jiao Sun Swabha Swayamdipta Jonathan May Xuezhe Ma 18 14 0 25 May 2022
Maieutic Prompting: Logically Consistent Reasoning with Recursive Explanations Jaehun Jung Lianhui Qin Sean Welleck Faeze Brahman Chandra Bhagavatula Ronan Le Bras Yejin Choi ReLM LRM 223 190 0 24 May 2022