Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1909.07913
Cited By
Learning to Deceive with Attention-Based Explanations
17 September 2019
Danish Pruthi
Mansi Gupta
Bhuwan Dhingra
Graham Neubig
Zachary Chase Lipton
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Learning to Deceive with Attention-Based Explanations"
50 / 50 papers shown
Title
Regularization, Semi-supervision, and Supervision for a Plausible Attention-Based Explanation
Duc Hau Nguyen
Cyrielle Mallart
Guillaume Gravier
Pascale Sébillot
68
0
0
22 Jan 2025
Explanation Regularisation through the Lens of Attributions
Pedro Ferreira
Wilker Aziz
Ivan Titov
51
1
0
23 Jul 2024
Why bother with geometry? On the relevance of linear decompositions of Transformer embeddings
Timothee Mickus
Raúl Vázquez
27
2
0
10 Oct 2023
Evaluating Explanation Methods for Vision-and-Language Navigation
Guanqi Chen
Lei Yang
Guanhua Chen
Jia Pan
XAI
25
0
0
10 Oct 2023
Explaining How Transformers Use Context to Build Predictions
Javier Ferrando
Gerard I. Gállego
Ioannis Tsiamas
Marta R. Costa-jussá
37
32
0
21 May 2023
Faithful Chain-of-Thought Reasoning
Qing Lyu
Shreya Havaldar
Adam Stein
Li Zhang
D. Rao
Eric Wong
Marianna Apidianaki
Chris Callison-Burch
ReLM
LRM
49
209
0
31 Jan 2023
Tensions Between the Proxies of Human Values in AI
Teresa Datta
D. Nissani
Max Cembalest
Akash Khanna
Haley Massa
John P. Dickerson
39
2
0
14 Dec 2022
MEGAN: Multi-Explanation Graph Attention Network
Jonas Teufel
Luca Torresi
Patrick Reiser
Pascal Friederich
31
8
0
23 Nov 2022
ViT-CX: Causal Explanation of Vision Transformers
Weiyan Xie
Xiao-hui Li
Caleb Chen Cao
Nevin L.Zhang
ViT
37
17
0
06 Nov 2022
Salience Allocation as Guidance for Abstractive Summarization
Fei Wang
Kaiqiang Song
Hongming Zhang
Lifeng Jin
Sangwoo Cho
Wenlin Yao
Xiaoyang Wang
Muhao Chen
Dong Yu
60
32
0
22 Oct 2022
StyLEx: Explaining Style Using Human Lexical Annotations
Shirley Anugrah Hayati
Kyumin Park
Dheeraj Rajagopal
Lyle Ungar
Dongyeop Kang
28
3
0
14 Oct 2022
On the Explainability of Natural Language Processing Deep Models
Julia El Zini
M. Awad
39
82
0
13 Oct 2022
Explanations, Fairness, and Appropriate Reliance in Human-AI Decision-Making
Jakob Schoeffer
Maria De-Arteaga
Niklas Kuehl
FaML
55
46
0
23 Sep 2022
How to Dissect a Muppet: The Structure of Transformer Embedding Spaces
Timothee Mickus
Denis Paperno
Mathieu Constant
36
20
0
07 Jun 2022
Grad-SAM: Explaining Transformers via Gradient Self-Attention Maps
Oren Barkan
Edan Hauon
Avi Caciularu
Ori Katz
Itzik Malkiel
Omri Armstrong
Noam Koenigstein
39
37
0
23 Apr 2022
ProtoTEx: Explaining Model Decisions with Prototype Tensors
Anubrata Das
Chitrank Gupta
Venelin Kovatchev
Matthew Lease
Junjie Li
39
27
0
11 Apr 2022
Interpretation of Black Box NLP Models: A Survey
Shivani Choudhary
N. Chatterjee
S. K. Saha
FAtt
34
10
0
31 Mar 2022
Measuring the Mixing of Contextual Information in the Transformer
Javier Ferrando
Gerard I. Gállego
Marta R. Costa-jussá
36
50
0
08 Mar 2022
POTATO: exPlainable infOrmation exTrAcTion framewOrk
Adam Kovacs
Kinga Gémes
Eszter Iklódi
Gábor Recski
38
4
0
31 Jan 2022
Counterfactual Explanations for Models of Code
Jürgen Cito
Işıl Dillig
V. Murali
S. Chandra
AAML
LRM
32
48
0
10 Nov 2021
Revisiting Methods for Finding Influential Examples
Karthikeyan K
Anders Søgaard
TDI
22
30
0
08 Nov 2021
Understanding Interlocking Dynamics of Cooperative Rationalization
Mo Yu
Yang Zhang
Shiyu Chang
Tommi Jaakkola
29
41
0
26 Oct 2021
Interpreting Deep Learning Models in Natural Language Processing: A Review
Xiaofei Sun
Diyi Yang
Xiaoya Li
Tianwei Zhang
Yuxian Meng
Han Qiu
Guoyin Wang
Eduard H. Hovy
Jiwei Li
24
45
0
20 Oct 2021
Evaluating the Faithfulness of Importance Measures in NLP by Recursively Masking Allegedly Important Tokens and Retraining
Andreas Madsen
Nicholas Meade
Vaibhav Adlakha
Siva Reddy
111
35
0
15 Oct 2021
Identifying and Mitigating Spurious Correlations for Improving Robustness in NLP Models
Tianlu Wang
Rohit Sridhar
Diyi Yang
Xuezhi Wang
AAML
120
72
0
14 Oct 2021
Automated and Explainable Ontology Extension Based on Deep Learning: A Case Study in the Chemical Domain
A. Memariani
Martin Glauer
Fabian Neuhaus
Till Mossakowski
Janna Hastings
36
5
0
19 Sep 2021
Diagnostics-Guided Explanation Generation
Pepa Atanasova
J. Simonsen
Christina Lioma
Isabelle Augenstein
LRM
FAtt
43
6
0
08 Sep 2021
Enjoy the Salience: Towards Better Transformer-based Faithful Explanations with Word Salience
G. Chrysostomou
Nikolaos Aletras
37
16
0
31 Aug 2021
DuTrust: A Sentiment Analysis Dataset for Trustworthiness Evaluation
Lijie Wang
Hao Liu
Shu-ping Peng
Hongxuan Tang
Xinyan Xiao
Ying-Cong Chen
Hua Wu
Haifeng Wang
25
5
0
30 Aug 2021
A Survey on Automated Fact-Checking
Zhijiang Guo
M. Schlichtkrull
Andreas Vlachos
34
460
0
26 Aug 2021
Zorro: Valid, Sparse, and Stable Explanations in Graph Neural Networks
Thorben Funke
Megha Khosla
Mandeep Rathee
Avishek Anand
FAtt
28
38
0
18 May 2021
Collaborative Graph Learning with Auxiliary Text for Temporal Event Prediction in Healthcare
Chang Lu
Chandan K. Reddy
Prithwish Chakraborty
Samantha Kleinberg
Yue Ning
26
58
0
16 May 2021
How Reliable are Model Diagnostics?
V. Aribandi
Yi Tay
Donald Metzler
19
19
0
12 May 2021
Rationalization through Concepts
Diego Antognini
Boi Faltings
FAtt
27
19
0
11 May 2021
Which transformer architecture fits my data? A vocabulary bottleneck in self-attention
Noam Wies
Yoav Levine
Daniel Jannai
Amnon Shashua
40
20
0
09 May 2021
Improving the Faithfulness of Attention-based Explanations with Task-specific Information for Text Classification
G. Chrysostomou
Nikolaos Aletras
32
37
0
06 May 2021
Do Feature Attribution Methods Correctly Attribute Features?
Yilun Zhou
Serena Booth
Marco Tulio Ribeiro
J. Shah
FAtt
XAI
40
132
0
27 Apr 2021
MeSIN: Multilevel Selective and Interactive Network for Medication Recommendation
Yang An
Liang Zhang
Mao You
Xueqing Tian
Bo Jin
Xiaopeng Wei
6
18
0
22 Apr 2021
Making Attention Mechanisms More Robust and Interpretable with Virtual Adversarial Training
Shunsuke Kitada
Hitoshi Iyatomi
AAML
33
8
0
18 Apr 2021
Of Non-Linearity and Commutativity in BERT
Sumu Zhao
Damian Pascual
Gino Brunner
Roger Wattenhofer
36
16
0
12 Jan 2021
Polyjuice: Generating Counterfactuals for Explaining, Evaluating, and Improving Models
Tongshuang Wu
Marco Tulio Ribeiro
Jeffrey Heer
Daniel S. Weld
48
244
0
01 Jan 2021
Deep-HOSeq: Deep Higher Order Sequence Fusion for Multimodal Sentiment Analysis
Sunny Verma
Jiwei Wang
Zhefeng Ge
Rujia Shen
Fan Jin
Yang Wang
Fang Chen
Wei Liu
29
20
0
16 Oct 2020
The elephant in the interpretability room: Why use attention as explanation when we have saliency methods?
Jasmijn Bastings
Katja Filippova
XAI
LRM
64
175
0
12 Oct 2020
BERTology Meets Biology: Interpreting Attention in Protein Language Models
Jesse Vig
Ali Madani
Lav Varshney
Caiming Xiong
R. Socher
Nazneen Rajani
34
289
0
26 Jun 2020
The Depth-to-Width Interplay in Self-Attention
Yoav Levine
Noam Wies
Or Sharir
Hofit Bata
Amnon Shashua
30
45
0
22 Jun 2020
Staying True to Your Word: (How) Can Attention Become Explanation?
Martin Tutek
Jan Snajder
16
27
0
19 May 2020
Quantifying Attention Flow in Transformers
Samira Abnar
Willem H. Zuidema
65
779
0
02 May 2020
Sequential Interpretability: Methods, Applications, and Future Direction for Understanding Deep Learning Models in the Context of Sequential Data
B. Shickel
Parisa Rashidi
AI4TS
33
17
0
27 Apr 2020
Towards Faithfully Interpretable NLP Systems: How should we define and evaluate faithfulness?
Alon Jacovi
Yoav Goldberg
XAI
48
571
0
07 Apr 2020
On Identifiability in Transformers
Gino Brunner
Yang Liu
Damian Pascual
Oliver Richter
Massimiliano Ciaramita
Roger Wattenhofer
ViT
30
187
0
12 Aug 2019
1