2109.05463
Logic Traps in Evaluating Attribution Scores
12 September 2021
Yiming Ju
Yuanzhe Zhang
Zhao Yang
Zhongtao Jiang
Kang Liu
Jun Zhao
XAI
FAtt
Papers citing "Logic Traps in Evaluating Attribution Scores"
40 / 40 papers shown
Normalized AOPC: Fixing Misleading Faithfulness Metrics for Feature Attribution Explainability
Joakim Edin
Andreas Geert Motzfeldt
Casper L. Christensen
Tuukka Ruotsalo
Lars Maaløe
Maria Maistro
122
4
0
15 Aug 2024
Perturbing Inputs for Fragile Interpretations in Deep Natural Language Processing
Sanchit Sinha
Hanjie Chen
Arshdeep Sekhon
Yangfeng Ji
Yanjun Qi
AAML
FAtt
60
41
0
11 Aug 2021
Evaluating Saliency Methods for Neural Language Models
Shuoyang Ding
Philipp Koehn
FAtt
XAI
47
55
0
12 Apr 2021
Interpretation of NLP models through input marginalization
Siwon Kim
Jihun Yi
Eunji Kim
Sungroh Yoon
MILM
FAtt
78
60
0
27 Oct 2020
Gradient-based Analysis of NLP Models is Manipulable
Junlin Wang
Jens Tuyls
Eric Wallace
Sameer Singh
AAML
FAtt
68
60
0
12 Oct 2020
Learning Variational Word Masks to Improve the Interpretability of Neural Text Classifiers
Hanjie Chen
Yangfeng Ji
AAML
VLM
87
66
0
01 Oct 2020
How does this interaction affect me? Interpretable attribution for feature interactions
Michael Tsang
Sirisha Rambhatla
Yan Liu
FAtt
68
87
0
19 Jun 2020
Self-Attention Attribution: Interpreting Information Interactions Inside Transformer
Y. Hao
Li Dong
Furu Wei
Ke Xu
ViT
80
225
0
23 Apr 2020
Towards Faithfully Interpretable NLP Systems: How should we define and evaluate faithfulness?
Alon Jacovi
Yoav Goldberg
XAI
131
600
0
07 Apr 2020
Generating Hierarchical Explanations on Text Classification via Feature Interaction Detection
Hanjie Chen
Guangtao Zheng
Yangfeng Ji
FAtt
95
95
0
04 Apr 2020
ERASER: A Benchmark to Evaluate Rationalized NLP Models
Jay DeYoung
Sarthak Jain
Nazneen Rajani
Eric P. Lehman
Caiming Xiong
R. Socher
Byron C. Wallace
130
638
0
08 Nov 2019
AllenNLP Interpret: A Framework for Explaining Predictions of NLP Models
Eric Wallace
Jens Tuyls
Junlin Wang
Sanjay Subramanian
Matt Gardner
Sameer Singh
MILM
68
138
0
19 Sep 2019
Learning to Deceive with Attention-Based Explanations
Danish Pruthi
Mansi Gupta
Bhuwan Dhingra
Graham Neubig
Zachary Chase Lipton
80
193
0
17 Sep 2019
Is BERT Really Robust? A Strong Baseline for Natural Language Attack on Text Classification and Entailment
Di Jin
Zhijing Jin
Qiufeng Wang
Peter Szolovits
SILM
AAML
185
1,086
0
27 Jul 2019
Interpretable Neural Predictions with Differentiable Binary Variables
Jasmijn Bastings
Wilker Aziz
Ivan Titov
82
214
0
20 May 2019
Attention is not Explanation
Sarthak Jain
Byron C. Wallace
FAtt
148
1,328
0
26 Feb 2019
Analysis Methods in Neural Language Processing: A Survey
Yonatan Belinkov
James R. Glass
95
558
0
21 Dec 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
1.8K
95,175
0
11 Oct 2018
Learning for Single-Shot Confidence Calibration in Deep Neural Networks through Stochastic Inferences
Seonguk Seo
Paul Hongsuck Seo
Bohyung Han
FedML
UQCV
BDL
125
76
0
28 Sep 2018
L-Shapley and C-Shapley: Efficient Model Interpretation for Structured Data
Jianbo Chen
Le Song
Martin J. Wainwright
Michael I. Jordan
FAtt
TDI
115
216
0
08 Aug 2018
On the Robustness of Interpretability Methods
David Alvarez-Melis
Tommi Jaakkola
84
528
0
21 Jun 2018
RISE: Randomized Input Sampling for Explanation of Black-box Models
Vitali Petsiuk
Abir Das
Kate Saenko
FAtt
181
1,176
0
19 Jun 2018
Pathologies of Neural Models Make Interpretations Difficult
Shi Feng
Eric Wallace
Alvin Grissom II
Mohit Iyyer
Pedro Rodriguez
Jordan L. Boyd-Graber
AAML
FAtt
82
321
0
20 Apr 2018
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
1.1K
7,196
0
20 Apr 2018
Beyond Word Importance: Contextual Decomposition to Extract Interactions from LSTMs
W. James Murdoch
Peter J. Liu
Bin Yu
80
210
0
16 Jan 2018
Mitigating Adversarial Effects Through Randomization
Cihang Xie
Jianyu Wang
Zhishuai Zhang
Zhou Ren
Alan Yuille
AAML
115
1,061
0
06 Nov 2017
The (Un)reliability of saliency methods
Pieter-Jan Kindermans
Sara Hooker
Julius Adebayo
Maximilian Alber
Kristof T. Schütt
Sven Dähne
D. Erhan
Been Kim
FAtt
XAI
106
688
0
02 Nov 2017
Interpretation of Neural Networks is Fragile
Amirata Ghorbani
Abubakar Abid
James Zou
FAtt
AAML
133
870
0
29 Oct 2017
On Calibration of Modern Neural Networks
Chuan Guo
Geoff Pleiss
Yu Sun
Kilian Q. Weinberger
UQCV
299
5,862
0
14 Jun 2017
A Unified Approach to Interpreting Model Predictions
Scott M. Lundberg
Su-In Lee
FAtt
1.1K
22,018
0
22 May 2017
Ensemble Adversarial Training: Attacks and Defenses
Florian Tramèr
Alexey Kurakin
Nicolas Papernot
Ian Goodfellow
Dan Boneh
Patrick McDaniel
AAML
177
2,729
0
19 May 2017
RACE: Large-scale ReAding Comprehension Dataset From Examinations
Guokun Lai
Qizhe Xie
Hanxiao Liu
Yiming Yang
Eduard H. Hovy
ELM
193
1,357
0
15 Apr 2017
Learning Important Features Through Propagating Activation Differences
Avanti Shrikumar
Peyton Greenside
A. Kundaje
FAtt
203
3,881
0
10 Apr 2017
Axiomatic Attribution for Deep Networks
Mukund Sundararajan
Ankur Taly
Qiqi Yan
OOD
FAtt
193
6,018
0
04 Mar 2017
Towards A Rigorous Science of Interpretable Machine Learning
Finale Doshi-Velez
Been Kim
XAI
FaML
406
3,813
0
28 Feb 2017
Understanding Neural Networks through Representation Erasure
Jiwei Li
Will Monroe
Dan Jurafsky
AAML
MILM
97
567
0
24 Dec 2016
Rationalizing Neural Predictions
Tao Lei
Regina Barzilay
Tommi Jaakkola
129
812
0
13 Jun 2016
The Limitations of Deep Learning in Adversarial Settings
Nicolas Papernot
Patrick McDaniel
S. Jha
Matt Fredrikson
Z. Berkay Celik
A. Swami
AAML
115
3,967
0
24 Nov 2015
Evaluating the visualization of what a Deep Neural Network has learned
Wojciech Samek
Alexander Binder
G. Montavon
Sebastian Lapuschkin
K. Müller
XAI
139
1,199
0
21 Sep 2015
Explaining and Harnessing Adversarial Examples
Ian Goodfellow
Jonathon Shlens
Christian Szegedy
AAML
GAN
282
19,121
0
20 Dec 2014