Post hoc Explanations may be Ineffective for Detecting Unknown Spurious Correlation
arXiv: 2212.04629, 9 December 2022
Julius Adebayo, M. Muelly, H. Abelson, Been Kim
Papers citing "Post hoc Explanations may be Ineffective for Detecting Unknown Spurious Correlation"
50 / 58 papers shown
Title | Authors | Date
In defence of post-hoc explanations in medical AI | Joshua Hatherley, Lauritz Munch, Jens Christian Bjerring | 29 Apr 2025
Probabilistic Stability Guarantees for Feature Attributions | Helen Jin, Anton Xue, Weiqiu You, Surbhi Goel, Eric Wong | 18 Apr 2025
B-cosification: Transforming Deep Neural Networks to be Inherently Interpretable | Shreyash Arya, Sukrut Rao, Moritz Bohle, Bernt Schiele | 28 Jan 2025
Multi-Scale Grouped Prototypes for Interpretable Semantic Segmentation | Hugo Porta, Emanuele Dalsasso, Diego Marcos, D. Tuia | 14 Sep 2024
"Will You Find These Shortcuts?" A Protocol for Evaluating the Faithfulness of Input Salience Methods for Text Classification
Jasmijn Bastings
Sebastian Ebert
Polina Zablotskaia
Anders Sandholm
Katja Filippova
130
78
0
14 Nov 2021
Finding and Fixing Spurious Patterns with Explanations | Gregory Plumb, Marco Tulio Ribeiro, Ameet Talwalkar | 03 Jun 2021
The effectiveness of feature attribution methods and its correlation with automatic evaluation scores | Giang Nguyen, Daeyoung Kim, Anh Totti Nguyen | 31 May 2021
Sanity Simulations for Saliency Methods | Joon Sik Kim, Gregory Plumb, Ameet Talwalkar | 13 May 2021
Do Feature Attribution Methods Correctly Attribute Features? | Yilun Zhou, Serena Booth, Marco Tulio Ribeiro, J. Shah | 27 Apr 2021
Do Input Gradients Highlight Discriminative Features? | Harshay Shah, Prateek Jain, Praneeth Netrapalli | 25 Feb 2021
FastIF: Scalable Influence Functions for Efficient Model Interpretation and Debugging | Han Guo, Nazneen Rajani, Peter Hase, Joey Tianyi Zhou, Caiming Xiong | 31 Dec 2020
Removing Spurious Features can Hurt Accuracy and Affect Groups Disproportionately | Fereshte Khani, Percy Liang | 07 Dec 2020
Debugging Tests for Model Explanations | Julius Adebayo, M. Muelly, Ilaria Liccardi, Been Kim | 10 Nov 2020
Understanding the Failure Modes of Out-of-Distribution Generalization | Vaishnavh Nagarajan, Anders Andreassen, Behnam Neyshabur | 29 Oct 2020
Now You See Me (CME): Concept-based Model Extraction | Dmitry Kazhdan, B. Dimanov, M. Jamnik, Pietro Lio, Adrian Weller | 25 Oct 2020
How Useful Are the Machine-Generated Interpretations to General Users? A Human Evaluation on Guessing the Incorrectly Predicted Labels | Hua Shen, Ting-Hao 'Kenneth' Huang | 26 Aug 2020
Are Visual Explanations Useful? A Case Study in Model-in-the-Loop Prediction | Eric Chu, D. Roy, Jacob Andreas | 23 Jul 2020
Debiasing Concept-based Explanations with Causal Analysis | M. T. Bahadori, David Heckerman | 22 Jul 2020
Fairwashing Explanations with Off-Manifold Detergent | Christopher J. Anders, Plamen Pasliev, Ann-Kathrin Dombrowski, K. Müller, Pan Kessel | 20 Jul 2020
Concept Bottleneck Models | Pang Wei Koh, Thao Nguyen, Y. S. Tang, Stephen Mussmann, Emma Pierson, Been Kim, Percy Liang | 09 Jul 2020
Influence Functions in Deep Learning Are Fragile | S. Basu, Phillip E. Pope, Soheil Feizi | 25 Jun 2020
Noise or Signal: The Role of Image Backgrounds in Object Recognition | Kai Y. Xiao, Logan Engstrom, Andrew Ilyas, Aleksander Madry | 17 Jun 2020
Explaining Black Box Predictions and Unveiling Data Artifacts through Influence Functions | Xiaochuang Han, Byron C. Wallace, Yulia Tsvetkov | 14 May 2020
An Investigation of Why Overparameterization Exacerbates Spurious Correlations | Shiori Sagawa, Aditi Raghunathan, Pang Wei Koh, Percy Liang | 09 May 2020
Shortcut Learning in Deep Neural Networks | Robert Geirhos, J. Jacobsen, Claudio Michaelis, R. Zemel, Wieland Brendel, Matthias Bethge, Felix Wichmann | 16 Apr 2020
Estimating Training Data Influence by Tracing Gradient Descent | G. Pruthi, Frederick Liu, Mukund Sundararajan, Satyen Kale | 19 Feb 2020
Concept Whitening for Interpretable Image Recognition | Zhi Chen, Yijie Bei, Cynthia Rudin | 05 Feb 2020
Evaluating Saliency Map Explanations for Convolutional Neural Networks: A User Study | Ahmed Alqaraawi, M. Schuessler, Philipp Weiß, Enrico Costanza, N. Bianchi-Berthouze | 03 Feb 2020
Sanity Checks for Saliency Metrics | Richard J. Tomsett, Daniel Harborne, Supriyo Chakraborty, Prudhvi K. Gurram, Alun D. Preece | 29 Nov 2019
Distributionally Robust Neural Networks for Group Shifts: On the Importance of Regularization for Worst-Case Generalization | Shiori Sagawa, Pang Wei Koh, Tatsunori B. Hashimoto, Percy Liang | 20 Nov 2019
"How do I fool you?": Manipulating User Trust via Misleading Black Box Explanations
Himabindu Lakkaraju
Osbert Bastani
56
254
0
15 Nov 2019
Fooling LIME and SHAP: Adversarial Attacks on Post hoc Explanation Methods | Dylan Slack, Sophie Hilgard, Emily Jia, Sameer Singh, Himabindu Lakkaraju | 06 Nov 2019
On Completeness-aware Concept-Based Explanations in Deep Neural Networks | Chih-Kuan Yeh, Been Kim, Sercan O. Arik, Chun-Liang Li, Tomas Pfister, Pradeep Ravikumar | 17 Oct 2019
Interpretations are useful: penalizing explanations to align neural networks with prior knowledge | Laura Rieger, Chandan Singh, W. James Murdoch, Bin Yu | 30 Sep 2019
Improving performance of deep learning models with axiomatic attribution priors and expected gradients | G. Erion, Joseph D. Janizek, Pascal Sturmfels, Scott M. Lundberg, Su-In Lee | 25 Jun 2019
Explanations can be manipulated and geometry is to blame | Ann-Kathrin Dombrowski, Maximilian Alber, Christopher J. Anders, M. Ackermann, K. Müller, Pan Kessel | 19 Jun 2019
Unmasking Clever Hans Predictors and Assessing What Machines Really Learn | Sebastian Lapuschkin, S. Wäldchen, Alexander Binder, G. Montavon, Wojciech Samek, K. Müller | 26 Feb 2019
Transfusion: Understanding Transfer Learning for Medical Imaging | M. Raghu, Chiyuan Zhang, Jon M. Kleinberg, Samy Bengio | 14 Feb 2019
Fooling Neural Network Interpretations via Adversarial Model Manipulation | Juyeon Heo, Sunghwan Joo, Taesup Moon | 06 Feb 2019
ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness | Robert Geirhos, Patricia Rubisch, Claudio Michaelis, Matthias Bethge, Felix Wichmann, Wieland Brendel | 29 Nov 2018
Representer Point Selection for Explaining Deep Neural Networks | Chih-Kuan Yeh, Joon Sik Kim, Ian En-Hsu Yen, Pradeep Ravikumar | 23 Nov 2018
Deep Learning Predicts Hip Fracture using Confounding Patient and Healthcare Variables | Giovanni Sutanto, J. Zech, Luke Oakden-Rayner, Yevgen Chebotar, Manway Liu, William Gale, M. McConnell, Ankur Handa, Thomas M. Snyder, Dieter Fox | 08 Nov 2018
Sanity Checks for Saliency Maps | Julius Adebayo, Justin Gilmer, M. Muelly, Ian Goodfellow, Moritz Hardt, Been Kim | 08 Oct 2018
A Benchmark for Interpretability Methods in Deep Neural Networks | Sara Hooker, D. Erhan, Pieter-Jan Kindermans, Been Kim | 28 Jun 2018
A Theoretical Explanation for Perplexing Behaviors of Backpropagation-based Visualizations | Weili Nie, Yang Zhang, Ankit B. Patel | 18 May 2018
Manipulating and Measuring Model Interpretability | Forough Poursabzi-Sangdeh, D. Goldstein, Jake M. Hofman, Jennifer Wortman Vaughan, Hanna M. Wallach | 21 Feb 2018
Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV) | Been Kim, Martin Wattenberg, Justin Gilmer, Carrie J. Cai, James Wexler, F. Viégas, Rory Sayres | 30 Nov 2017
Interpretation of Neural Networks is Fragile | Amirata Ghorbani, Abubakar Abid, James Zou | 29 Oct 2017
SmoothGrad: removing noise by adding noise | D. Smilkov, Nikhil Thorat, Been Kim, F. Viégas, Martin Wattenberg | 12 Jun 2017
Network Dissection: Quantifying Interpretability of Deep Visual Representations | David Bau, Bolei Zhou, A. Khosla, A. Oliva, Antonio Torralba | 19 Apr 2017