Towards Faithfully Interpretable NLP Systems: How should we define and evaluate faithfulness?
Alon Jacovi, Yoav Goldberg
arXiv:2004.03685 · 7 April 2020 · XAI
Papers citing "Towards Faithfully Interpretable NLP Systems: How should we define and evaluate faithfulness?" (50 of 381 papers shown):
- Interpreting and Exploiting Functional Specialization in Multi-Head Attention under Multi-task Learning. Chong Li, Shaonan Wang, Yunhao Zhang, Jiajun Zhang, Chengqing Zong. 16 Oct 2023.
- Faithfulness Measurable Masked Language Models. Andreas Madsen, Siva Reddy, Sarath Chandar. 11 Oct 2023.
- Evaluating Explanation Methods for Vision-and-Language Navigation. Guanqi Chen, Lei Yang, Guanhua Chen, Jia Pan. 10 Oct 2023. [XAI]
- Dynamic Top-k Estimation Consolidates Disagreement between Feature Attribution Methods. Jonathan Kamp, Lisa Beinborn, Antske Fokkens. 09 Oct 2023. [FAtt]
- Copy Suppression: Comprehensively Understanding an Attention Head. Callum McDougall, Arthur Conmy, Cody Rushing, Thomas McGrath, Neel Nanda. 06 Oct 2023. [MILM]
- DecoderLens: Layerwise Interpretation of Encoder-Decoder Transformers. Anna Langedijk, Hosein Mohebbi, Gabriele Sarti, Willem H. Zuidema, Jaap Jumelet. 05 Oct 2023.
- A Framework for Interpretability in Machine Learning for Medical Imaging. Alan Q. Wang, Batuhan K. Karaman, Heejong Kim, Jacob Rosenthal, Rachit Saluja, Sean I. Young, M. Sabuncu. 02 Oct 2023. [AI4CE]
- Quantifying the Plausibility of Context Reliance in Neural Machine Translation. Gabriele Sarti, Grzegorz Chrupała, Malvina Nissim, Arianna Bisazza. 02 Oct 2023.
- Faithful Explanations of Black-box NLP Models Using LLM-generated Counterfactuals. Y. Gat, Nitay Calderon, Amir Feder, Alexander Chapanin, Amit Sharma, Roi Reichart. 01 Oct 2023.
- Corex: Pushing the Boundaries of Complex Reasoning through Multi-Model Collaboration. Qiushi Sun, Zhangyue Yin, Xiang Li, Zhiyong Wu, Xipeng Qiu, Lingpeng Kong. 30 Sep 2023. [LRM, LLMAG]
- Augment to Interpret: Unsupervised and Inherently Interpretable Graph Embeddings. Gregory Scafarto, Madalina Ciortan, Simon Tihon, Quentin Ferre. 28 Sep 2023.
- GInX-Eval: Towards In-Distribution Evaluation of Graph Neural Network Explanations. Kenza Amara, Mennatallah El-Assady, Rex Ying. 28 Sep 2023.
- May I Ask a Follow-up Question? Understanding the Benefits of Conversations in Neural Network Explainability. Tong Zhang, Xiaoyu Yang, Boyang Albert Li. 25 Sep 2023.
- A Comprehensive Review on Financial Explainable AI. Wei Jie Yeo, Wihan van der Heever, Rui Mao, Min Zhang, Ranjan Satapathy, G. Mengaldo. 21 Sep 2023. [XAI, AI4TS]
- Explaining Speech Classification Models via Word-Level Audio Segments and Paralinguistic Features. Eliana Pastor, Alkis Koudounas, Giuseppe Attanasio, Dirk Hovy, Elena Baralis. 14 Sep 2023.
- Explainability for Large Language Models: A Survey. Haiyan Zhao, Hanjie Chen, Fan Yang, Ninghao Liu, Huiqi Deng, Hengyi Cai, Shuaiqiang Wang, Dawei Yin, Mengnan Du. 02 Sep 2023. [LRM]
- Large Language Models on the Chessboard: A Study on ChatGPT's Formal Language Comprehension and Complex Reasoning Skills. Mu-Tien Kuo, Chih-Chung Hsueh, Richard Tzong-Han Tsai. 29 Aug 2023. [ELM, ReLM, LRM]
- Goodhart's Law Applies to NLP's Explanation Benchmarks. Jennifer Hsia, Danish Pruthi, Aarti Singh, Zachary Chase Lipton. 28 Aug 2023.
- Situated Natural Language Explanations. Zining Zhu, Hao Jiang, Jingfeng Yang, Sreyashi Nag, Chao Zhang, Jie Huang, Yifan Gao, Frank Rudzicz, Bing Yin. 27 Aug 2023. [LRM]
- Robust Infidelity: When Faithfulness Measures on Masked Language Models Are Misleading. Evan Crothers, H. Viktor, Nathalie Japkowicz. 13 Aug 2023. [AAML]
- Generative Models as a Complex Systems Science: How can we make sense of large language model behavior? Ari Holtzman, Peter West, Luke Zettlemoyer. 31 Jul 2023. [AI4CE]
- HAGRID: A Human-LLM Collaborative Dataset for Generative Information-Seeking with Attribution. Ehsan Kamalloo, A. Jafari, Xinyu Crystina Zhang, Nandan Thakur, Jimmy J. Lin. 31 Jul 2023.
- The Co-12 Recipe for Evaluating Interpretable Part-Prototype Image Classifiers. Meike Nauta, Christin Seifert. 26 Jul 2023.
- Do Models Explain Themselves? Counterfactual Simulatability of Natural Language Explanations. Yanda Chen, Ruiqi Zhong, Narutatsu Ri, Chen Zhao, He He, Jacob Steinhardt, Zhou Yu, Kathleen McKeown. 17 Jul 2023. [LRM]
- Measuring Faithfulness in Chain-of-Thought Reasoning. Tamera Lanham, Anna Chen, Ansh Radhakrishnan, Benoit Steiner, Carson E. Denison, ..., Zac Hatfield-Dodds, Jared Kaplan, J. Brauner, Sam Bowman, Ethan Perez. 17 Jul 2023. [ReLM, LRM]
- Question Decomposition Improves the Faithfulness of Model-Generated Reasoning. Ansh Radhakrishnan, Karina Nguyen, Anna Chen, Carol Chen, Carson E. Denison, ..., Zac Hatfield-Dodds, Jared Kaplan, J. Brauner, Sam Bowman, Ethan Perez. 17 Jul 2023. [ReLM, LRM, HILM]
- Stability Guarantees for Feature Attributions with Multiplicative Smoothing. Anton Xue, Rajeev Alur, Eric Wong. 12 Jul 2023.
- DARE: Towards Robust Text Explanations in Biomedical and Healthcare Applications. Adam Ivankay, Mattia Rigotti, P. Frossard. 05 Jul 2023. [OOD, MedIm]
- Fixing confirmation bias in feature attribution methods via semantic match. Giovanni Cinà, Daniel Fernandez-Llaneza, Ludovico Deponte, Nishant Mishra, Tabea E. Röber, Sandro Pezzelle, Iacer Calixto, Rob Goedhart, Ş. İlker Birbil. 03 Jul 2023. [FAtt]
- Towards Explainable Evaluation Metrics for Machine Translation. Christoph Leiter, Piyawat Lertvittayakumjorn, M. Fomicheva, Wei-Ye Zhao, Yang Gao, Steffen Eger. 22 Jun 2023. [ELM]
- Evaluating the overall sensitivity of saliency-based explanation methods. Harshinee Sriram, Cristina Conati. 21 Jun 2023. [AAML, XAI, FAtt]
- Learning Locally Interpretable Rule Ensemble. Kentaro Kanamori. 20 Jun 2023.
- A Holistic Approach to Unifying Automatic Concept Extraction and Concept Importance Estimation. Thomas Fel, Victor Boutin, Mazda Moayeri, Rémi Cadène, Louis Bethune, Léo Andéol, Mathieu Chalvidal, Thomas Serre. 11 Jun 2023. [FAtt]
- Boosting Language Models Reasoning with Chain-of-Knowledge Prompting. Jie Wang, Qiushi Sun, Xiang Li, Ming Gao. 10 Jun 2023. [ReLM, LRM]
- Few Shot Rationale Generation using Self-Training with Dual Teachers. Aditya Srikanth Veerubhotla, Lahari Poddar, J. Yin, Gyuri Szarvas, S. Eswaran. 05 Jun 2023. [LRM]
- AI Transparency in the Age of LLMs: A Human-Centered Research Roadmap. Q. V. Liao, J. Vaughan. 02 Jun 2023.
- Theoretical Behavior of XAI Methods in the Presence of Suppressor Variables. Rick Wilming, Leo Kieslich, Benedict Clark, Stefan Haufe. 02 Jun 2023.
- Being Right for Whose Right Reasons? Terne Sasha Thorn Jakobsen, Laura Cabello, Anders Søgaard. 01 Jun 2023.
- Efficient Shapley Values Estimation by Amortization for Text Classification. Chenghao Yang, Fan Yin, He He, Kai-Wei Chang, Xiaofei Ma, Bing Xiang. 31 May 2023. [FAtt, VLM]
- Perturbation-based Self-supervised Attention for Attention Bias in Text Classification. Hu Feng, Zhenxi Lin, Qianli Ma. 25 May 2023.
- MaNtLE: Model-agnostic Natural Language Explainer. Rakesh R Menon, Kerem Zaman, Shashank Srivastava. 22 May 2023. [FAtt, LRM]
- Incorporating Attribution Importance for Improving Faithfulness Metrics. Zhixue Zhao, Nikolaos Aletras. 17 May 2023.
- ConvXAI: Delivering Heterogeneous AI Explanations via Conversations to Support Human-AI Scientific Writing. Hua Shen, Chieh-Yang Huang, Tongshuang Wu, Ting-Hao 'Kenneth' Huang. 16 May 2023.
- Consistent Multi-Granular Rationale Extraction for Explainable Multi-hop Fact Verification. Jiasheng Si, Yingjie Zhu, Deyu Zhou. 16 May 2023. [AAML]
- Language Models Don't Always Say What They Think: Unfaithful Explanations in Chain-of-Thought Prompting. Miles Turpin, Julian Michael, Ethan Perez, Sam Bowman. 07 May 2023. [ReLM, LRM]
- Read it Twice: Towards Faithfully Interpretable Fact Verification by Revisiting Evidence. Xuming Hu, Zhaochen Hong, Zhijiang Guo, Lijie Wen, Philip S. Yu. 02 May 2023. [HILM]
- Towards Automated Circuit Discovery for Mechanistic Interpretability. Arthur Conmy, Augustine N. Mavor-Parker, Aengus Lynch, Stefan Heimersheim, Adrià Garriga-Alonso. 28 Apr 2023.
- Answering Questions by Meta-Reasoning over Multiple Chains of Thought. Ori Yoran, Tomer Wolfson, Ben Bogin, Uri Katz, Daniel Deutch, Jonathan Berant. 25 Apr 2023. [ReLM, LRM, KELM]
- VISION DIFFMASK: Faithful Interpretation of Vision Transformers with Differentiable Patch Masking. A. Nalmpantis, Apostolos Panagiotopoulos, John Gkountouras, Konstantinos Papakostas, Wilker Aziz. 13 Apr 2023.
- Why is plausibility surprisingly problematic as an XAI criterion? Weina Jin, Xiaoxiao Li, Ghassan Hamarneh. 30 Mar 2023.