Towards Faithfully Interpretable NLP Systems: How should we define and evaluate faithfulness?
Alon Jacovi, Yoav Goldberg
arXiv:2004.03685 · 7 April 2020 · XAI
Papers citing "Towards Faithfully Interpretable NLP Systems: How should we define and evaluate faithfulness?" (50 of 381 papers shown):
- Interpreting and Exploiting Functional Specialization in Multi-Head Attention under Multi-task Learning. Chong Li, Shaonan Wang, Yunhao Zhang, Jiajun Zhang, Chengqing Zong. 16 Oct 2023.
- Faithfulness Measurable Masked Language Models. Andreas Madsen, Siva Reddy, Sarath Chandar. 11 Oct 2023.
- Evaluating Explanation Methods for Vision-and-Language Navigation. Guanqi Chen, Lei Yang, Guanhua Chen, Jia Pan. 10 Oct 2023. [XAI]
- Dynamic Top-k Estimation Consolidates Disagreement between Feature Attribution Methods. Jonathan Kamp, Lisa Beinborn, Antske Fokkens. 09 Oct 2023. [FAtt]
- Copy Suppression: Comprehensively Understanding an Attention Head. Callum McDougall, Arthur Conmy, Cody Rushing, Thomas McGrath, Neel Nanda. 06 Oct 2023. [MILM]
- DecoderLens: Layerwise Interpretation of Encoder-Decoder Transformers. Anna Langedijk, Hosein Mohebbi, Gabriele Sarti, Willem H. Zuidema, Jaap Jumelet. 05 Oct 2023.
- A Framework for Interpretability in Machine Learning for Medical Imaging. Alan Q. Wang, Batuhan K. Karaman, Heejong Kim, Jacob Rosenthal, Rachit Saluja, Sean I. Young, M. Sabuncu. 02 Oct 2023. [AI4CE]
- Quantifying the Plausibility of Context Reliance in Neural Machine Translation. Gabriele Sarti, Grzegorz Chrupała, Malvina Nissim, Arianna Bisazza. 02 Oct 2023.
- Faithful Explanations of Black-box NLP Models Using LLM-generated Counterfactuals. Y. Gat, Nitay Calderon, Amir Feder, Alexander Chapanin, Amit Sharma, Roi Reichart. 01 Oct 2023.
- Corex: Pushing the Boundaries of Complex Reasoning through Multi-Model Collaboration. Qiushi Sun, Zhangyue Yin, Xiang Li, Zhiyong Wu, Xipeng Qiu, Lingpeng Kong. 30 Sep 2023. [LRM, LLMAG]
- Augment to Interpret: Unsupervised and Inherently Interpretable Graph Embeddings. Gregory Scafarto, Madalina Ciortan, Simon Tihon, Quentin Ferre. 28 Sep 2023.
- GInX-Eval: Towards In-Distribution Evaluation of Graph Neural Network Explanations. Kenza Amara, Mennatallah El-Assady, Rex Ying. 28 Sep 2023.
- May I Ask a Follow-up Question? Understanding the Benefits of Conversations in Neural Network Explainability. Tong Zhang, Xiaoyu Yang, Boyang Albert Li. 25 Sep 2023.
- A Comprehensive Review on Financial Explainable AI. Wei Jie Yeo, Wihan van der Heever, Rui Mao, Min Zhang, Ranjan Satapathy, G. Mengaldo. 21 Sep 2023. [XAI, AI4TS]
- Explaining Speech Classification Models via Word-Level Audio Segments and Paralinguistic Features. Eliana Pastor, Alkis Koudounas, Giuseppe Attanasio, Dirk Hovy, Elena Baralis. 14 Sep 2023.
- Explainability for Large Language Models: A Survey. Haiyan Zhao, Hanjie Chen, Fan Yang, Ninghao Liu, Huiqi Deng, Hengyi Cai, Shuaiqiang Wang, Dawei Yin, Mengnan Du. 02 Sep 2023. [LRM]
- Large Language Models on the Chessboard: A Study on ChatGPT's Formal Language Comprehension and Complex Reasoning Skills. Mu-Tien Kuo, Chih-Chung Hsueh, Richard Tzong-Han Tsai. 29 Aug 2023. [ELM, ReLM, LRM]
- Goodhart's Law Applies to NLP's Explanation Benchmarks. Jennifer Hsia, Danish Pruthi, Aarti Singh, Zachary Chase Lipton. 28 Aug 2023.
- Situated Natural Language Explanations. Zining Zhu, Hao Jiang, Jingfeng Yang, Sreyashi Nag, Chao Zhang, Jie Huang, Yifan Gao, Frank Rudzicz, Bing Yin. 27 Aug 2023. [LRM]
- Robust Infidelity: When Faithfulness Measures on Masked Language Models Are Misleading. Evan Crothers, H. Viktor, Nathalie Japkowicz. 13 Aug 2023. [AAML]
- Generative Models as a Complex Systems Science: How can we make sense of large language model behavior? Ari Holtzman, Peter West, Luke Zettlemoyer. 31 Jul 2023. [AI4CE]
- HAGRID: A Human-LLM Collaborative Dataset for Generative Information-Seeking with Attribution. Ehsan Kamalloo, A. Jafari, Xinyu Crystina Zhang, Nandan Thakur, Jimmy J. Lin. 31 Jul 2023.
- The Co-12 Recipe for Evaluating Interpretable Part-Prototype Image Classifiers. Meike Nauta, Christin Seifert. 26 Jul 2023.
- Do Models Explain Themselves? Counterfactual Simulatability of Natural Language Explanations. Yanda Chen, Ruiqi Zhong, Narutatsu Ri, Chen Zhao, He He, Jacob Steinhardt, Zhou Yu, Kathleen McKeown. 17 Jul 2023. [LRM]
- Measuring Faithfulness in Chain-of-Thought Reasoning. Tamera Lanham, Anna Chen, Ansh Radhakrishnan, Benoit Steiner, Carson E. Denison, ..., Zac Hatfield-Dodds, Jared Kaplan, J. Brauner, Sam Bowman, Ethan Perez. 17 Jul 2023. [ReLM, LRM]
- Question Decomposition Improves the Faithfulness of Model-Generated Reasoning. Ansh Radhakrishnan, Karina Nguyen, Anna Chen, Carol Chen, Carson E. Denison, ..., Zac Hatfield-Dodds, Jared Kaplan, J. Brauner, Sam Bowman, Ethan Perez. 17 Jul 2023. [ReLM, LRM, HILM]
- Stability Guarantees for Feature Attributions with Multiplicative Smoothing. Anton Xue, Rajeev Alur, Eric Wong. 12 Jul 2023.
- DARE: Towards Robust Text Explanations in Biomedical and Healthcare Applications. Adam Ivankay, Mattia Rigotti, P. Frossard. 05 Jul 2023. [OOD, MedIm]
- Fixing confirmation bias in feature attribution methods via semantic match. Giovanni Cinà, Daniel Fernandez-Llaneza, Ludovico Deponte, Nishant Mishra, Tabea E. Röber, Sandro Pezzelle, Iacer Calixto, Rob Goedhart, Ş. İlker Birbil. 03 Jul 2023. [FAtt]
- Towards Explainable Evaluation Metrics for Machine Translation. Christoph Leiter, Piyawat Lertvittayakumjorn, M. Fomicheva, Wei-Ye Zhao, Yang Gao, Steffen Eger. 22 Jun 2023. [ELM]
- Evaluating the overall sensitivity of saliency-based explanation methods. Harshinee Sriram, Cristina Conati. 21 Jun 2023. [AAML, XAI, FAtt]
- Learning Locally Interpretable Rule Ensemble. Kentaro Kanamori. 20 Jun 2023.
- A Holistic Approach to Unifying Automatic Concept Extraction and Concept Importance Estimation. Thomas Fel, Victor Boutin, Mazda Moayeri, Rémi Cadène, Louis Bethune, Léo Andéol, Mathieu Chalvidal, Thomas Serre. 11 Jun 2023. [FAtt]
- Boosting Language Models Reasoning with Chain-of-Knowledge Prompting. Jie Wang, Qiushi Sun, Xiang Li, Ming Gao. 10 Jun 2023. [ReLM, LRM]
- Few Shot Rationale Generation using Self-Training with Dual Teachers. Aditya Srikanth Veerubhotla, Lahari Poddar, J. Yin, Gyuri Szarvas, S. Eswaran. 05 Jun 2023. [LRM]
- AI Transparency in the Age of LLMs: A Human-Centered Research Roadmap. Q. V. Liao, J. Vaughan. 02 Jun 2023.
- Theoretical Behavior of XAI Methods in the Presence of Suppressor Variables. Rick Wilming, Leo Kieslich, Benedict Clark, Stefan Haufe. 02 Jun 2023.
- Being Right for Whose Right Reasons? Terne Sasha Thorn Jakobsen, Laura Cabello, Anders Søgaard. 01 Jun 2023.
- Efficient Shapley Values Estimation by Amortization for Text Classification. Chenghao Yang, Fan Yin, He He, Kai-Wei Chang, Xiaofei Ma, Bing Xiang. 31 May 2023. [FAtt, VLM]
- Perturbation-based Self-supervised Attention for Attention Bias in Text Classification. Hu Feng, Zhenxi Lin, Qianli Ma. 25 May 2023.
- MaNtLE: Model-agnostic Natural Language Explainer. Rakesh R Menon, Kerem Zaman, Shashank Srivastava. 22 May 2023. [FAtt, LRM]
- Incorporating Attribution Importance for Improving Faithfulness Metrics. Zhixue Zhao, Nikolaos Aletras. 17 May 2023.
- ConvXAI: Delivering Heterogeneous AI Explanations via Conversations to Support Human-AI Scientific Writing. Hua Shen, Chieh-Yang Huang, Tongshuang Wu, Ting-Hao 'Kenneth' Huang. 16 May 2023.
- Consistent Multi-Granular Rationale Extraction for Explainable Multi-hop Fact Verification. Jiasheng Si, Yingjie Zhu, Deyu Zhou. 16 May 2023. [AAML]
- Language Models Don't Always Say What They Think: Unfaithful Explanations in Chain-of-Thought Prompting. Miles Turpin, Julian Michael, Ethan Perez, Sam Bowman. 07 May 2023. [ReLM, LRM]
- Read it Twice: Towards Faithfully Interpretable Fact Verification by Revisiting Evidence. Xuming Hu, Zhaochen Hong, Zhijiang Guo, Lijie Wen, Philip S. Yu. 02 May 2023. [HILM]
- Towards Automated Circuit Discovery for Mechanistic Interpretability. Arthur Conmy, Augustine N. Mavor-Parker, Aengus Lynch, Stefan Heimersheim, Adrià Garriga-Alonso. 28 Apr 2023.
- Answering Questions by Meta-Reasoning over Multiple Chains of Thought. Ori Yoran, Tomer Wolfson, Ben Bogin, Uri Katz, Daniel Deutch, Jonathan Berant. 25 Apr 2023. [ReLM, LRM, KELM]
- VISION DIFFMASK: Faithful Interpretation of Vision Transformers with Differentiable Patch Masking. A. Nalmpantis, Apostolos Panagiotopoulos, John Gkountouras, Konstantinos Papakostas, Wilker Aziz. 13 Apr 2023.
- Why is plausibility surprisingly problematic as an XAI criterion? Weina Jin, Xiaoxiao Li, Ghassan Hamarneh. 30 Mar 2023.