Post hoc Explanations may be Ineffective for Detecting Unknown Spurious Correlation

9 December 2022
Julius Adebayo, M. Muelly, H. Abelson, Been Kim

Papers citing "Post hoc Explanations may be Ineffective for Detecting Unknown Spurious Correlation"

50 / 58 papers shown
In defence of post-hoc explanations in medical AI
Joshua Hatherley, Lauritz Munch, Jens Christian Bjerring
64 · 0 · 0 · 29 Apr 2025
Probabilistic Stability Guarantees for Feature Attributions
Helen Jin, Anton Xue, Weiqiu You, Surbhi Goel, Eric Wong
92 · 0 · 0 · 18 Apr 2025
B-cosification: Transforming Deep Neural Networks to be Inherently Interpretable
Shreyash Arya, Sukrut Rao, Moritz Bohle, Bernt Schiele
143 · 3 · 0 · 28 Jan 2025
Multi-Scale Grouped Prototypes for Interpretable Semantic Segmentation
Hugo Porta, Emanuele Dalsasso, Diego Marcos, D. Tuia
223 · 0 · 0 · 14 Sep 2024
"Will You Find These Shortcuts?" A Protocol for Evaluating the Faithfulness of Input Salience Methods for Text Classification
Jasmijn Bastings, Sebastian Ebert, Polina Zablotskaia, Anders Sandholm, Katja Filippova
130 · 78 · 0 · 14 Nov 2021
Finding and Fixing Spurious Patterns with Explanations
Gregory Plumb, Marco Tulio Ribeiro, Ameet Talwalkar
69 · 42 · 0 · 03 Jun 2021
The effectiveness of feature attribution methods and its correlation with automatic evaluation scores
Giang Nguyen, Daeyoung Kim, Anh Totti Nguyen
FAtt
105 · 89 · 0 · 31 May 2021
Sanity Simulations for Saliency Methods
Joon Sik Kim, Gregory Plumb, Ameet Talwalkar
FAtt
60 · 17 · 0 · 13 May 2021
Do Feature Attribution Methods Correctly Attribute Features?
Yilun Zhou, Serena Booth, Marco Tulio Ribeiro, J. Shah
FAtt, XAI
71 · 133 · 0 · 27 Apr 2021
Do Input Gradients Highlight Discriminative Features?
Harshay Shah, Prateek Jain, Praneeth Netrapalli
AAML, FAtt
57 · 59 · 0 · 25 Feb 2021
FastIF: Scalable Influence Functions for Efficient Model Interpretation and Debugging
Han Guo, Nazneen Rajani, Peter Hase, Joey Tianyi Zhou, Caiming Xiong
TDI
79 · 112 · 0 · 31 Dec 2020
Removing Spurious Features can Hurt Accuracy and Affect Groups Disproportionately
Fereshte Khani, Percy Liang
FaML
49 · 65 · 0 · 07 Dec 2020
Debugging Tests for Model Explanations
Julius Adebayo, M. Muelly, Ilaria Liccardi, Been Kim
FAtt
64 · 181 · 0 · 10 Nov 2020
Understanding the Failure Modes of Out-of-Distribution Generalization
Vaishnavh Nagarajan, Anders Andreassen, Behnam Neyshabur
OOD, OODD
48 · 177 · 0 · 29 Oct 2020
Now You See Me (CME): Concept-based Model Extraction
Dmitry Kazhdan, B. Dimanov, M. Jamnik, Pietro Lio, Adrian Weller
46 · 75 · 0 · 25 Oct 2020
How Useful Are the Machine-Generated Interpretations to General Users? A Human Evaluation on Guessing the Incorrectly Predicted Labels
Hua Shen, Ting-Hao 'Kenneth' Huang
FAtt, HAI
54 · 56 · 0 · 26 Aug 2020
Are Visual Explanations Useful? A Case Study in Model-in-the-Loop Prediction
Eric Chu, D. Roy, Jacob Andreas
FAtt, LRM
49 · 71 · 0 · 23 Jul 2020
Debiasing Concept-based Explanations with Causal Analysis
M. T. Bahadori, David Heckerman
FAtt, CML
55 · 39 · 0 · 22 Jul 2020
Fairwashing Explanations with Off-Manifold Detergent
Christopher J. Anders, Plamen Pasliev, Ann-Kathrin Dombrowski, K. Müller, Pan Kessel
FAtt, FaML
42 · 97 · 0 · 20 Jul 2020
Concept Bottleneck Models
Pang Wei Koh, Thao Nguyen, Y. S. Tang, Stephen Mussmann, Emma Pierson, Been Kim, Percy Liang
94 · 818 · 0 · 09 Jul 2020
Influence Functions in Deep Learning Are Fragile
S. Basu, Phillip E. Pope, Soheil Feizi
TDI
99 · 230 · 0 · 25 Jun 2020
Noise or Signal: The Role of Image Backgrounds in Object Recognition
Kai Y. Xiao, Logan Engstrom, Andrew Ilyas, Aleksander Madry
131 · 387 · 0 · 17 Jun 2020
Explaining Black Box Predictions and Unveiling Data Artifacts through Influence Functions
Xiaochuang Han, Byron C. Wallace, Yulia Tsvetkov
MILM, FAtt, AAML, TDI
55 · 171 · 0 · 14 May 2020
An Investigation of Why Overparameterization Exacerbates Spurious Correlations
Shiori Sagawa, Aditi Raghunathan, Pang Wei Koh, Percy Liang
182 · 379 · 0 · 09 May 2020
Shortcut Learning in Deep Neural Networks
Robert Geirhos, J. Jacobsen, Claudio Michaelis, R. Zemel, Wieland Brendel, Matthias Bethge, Felix Wichmann
198 · 2,044 · 0 · 16 Apr 2020
Estimating Training Data Influence by Tracing Gradient Descent
G. Pruthi, Frederick Liu, Mukund Sundararajan, Satyen Kale
TDI
68 · 404 · 0 · 19 Feb 2020
Concept Whitening for Interpretable Image Recognition
Zhi Chen, Yijie Bei, Cynthia Rudin
FAtt
63 · 320 · 0 · 05 Feb 2020
Evaluating Saliency Map Explanations for Convolutional Neural Networks: A User Study
Ahmed Alqaraawi, M. Schuessler, Philipp Weiß, Enrico Costanza, N. Bianchi-Berthouze
AAML, FAtt, XAI
61 · 200 · 0 · 03 Feb 2020
Sanity Checks for Saliency Metrics
Richard J. Tomsett, Daniel Harborne, Supriyo Chakraborty, Prudhvi K. Gurram, Alun D. Preece
XAI
67 · 169 · 0 · 29 Nov 2019
Distributionally Robust Neural Networks for Group Shifts: On the Importance of Regularization for Worst-Case Generalization
Shiori Sagawa, Pang Wei Koh, Tatsunori B. Hashimoto, Percy Liang
OOD
85 · 1,236 · 0 · 20 Nov 2019
"How do I fool you?": Manipulating User Trust via Misleading Black Box
  Explanations
"How do I fool you?": Manipulating User Trust via Misleading Black Box Explanations
Himabindu Lakkaraju
Osbert Bastani
56
254
0
15 Nov 2019
Fooling LIME and SHAP: Adversarial Attacks on Post hoc Explanation
  Methods
Fooling LIME and SHAP: Adversarial Attacks on Post hoc Explanation Methods
Dylan Slack
Sophie Hilgard
Emily Jia
Sameer Singh
Himabindu Lakkaraju
FAtt
AAML
MLAU
66
817
0
06 Nov 2019
On Completeness-aware Concept-Based Explanations in Deep Neural Networks
On Completeness-aware Concept-Based Explanations in Deep Neural Networks
Chih-Kuan Yeh
Been Kim
Sercan O. Arik
Chun-Liang Li
Tomas Pfister
Pradeep Ravikumar
FAtt
210
304
0
17 Oct 2019
Interpretations are useful: penalizing explanations to align neural
  networks with prior knowledge
Interpretations are useful: penalizing explanations to align neural networks with prior knowledge
Laura Rieger
Chandan Singh
W. James Murdoch
Bin Yu
FAtt
78
214
0
30 Sep 2019
Improving performance of deep learning models with axiomatic attribution
  priors and expected gradients
Improving performance of deep learning models with axiomatic attribution priors and expected gradients
G. Erion
Joseph D. Janizek
Pascal Sturmfels
Scott M. Lundberg
Su-In Lee
OOD
BDL
FAtt
52
81
0
25 Jun 2019
Explanations can be manipulated and geometry is to blame
Ann-Kathrin Dombrowski, Maximilian Alber, Christopher J. Anders, M. Ackermann, K. Müller, Pan Kessel
AAML, FAtt
78 · 330 · 0 · 19 Jun 2019
Unmasking Clever Hans Predictors and Assessing What Machines Really Learn
Sebastian Lapuschkin, S. Wäldchen, Alexander Binder, G. Montavon, Wojciech Samek, K. Müller
84 · 1,009 · 0 · 26 Feb 2019
Transfusion: Understanding Transfer Learning for Medical Imaging
M. Raghu, Chiyuan Zhang, Jon M. Kleinberg, Samy Bengio
MedIm
75 · 982 · 0 · 14 Feb 2019
Fooling Neural Network Interpretations via Adversarial Model Manipulation
Juyeon Heo, Sunghwan Joo, Taesup Moon
AAML, FAtt
88 · 202 · 0 · 06 Feb 2019
ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness
Robert Geirhos, Patricia Rubisch, Claudio Michaelis, Matthias Bethge, Felix Wichmann, Wieland Brendel
96 · 2,662 · 0 · 29 Nov 2018
Representer Point Selection for Explaining Deep Neural Networks
Chih-Kuan Yeh, Joon Sik Kim, Ian En-Hsu Yen, Pradeep Ravikumar
TDI
64 · 251 · 0 · 23 Nov 2018
Deep Learning Predicts Hip Fracture using Confounding Patient and Healthcare Variables
Giovanni Sutanto, J. Zech, Luke Oakden-Rayner, Yevgen Chebotar, Manway Liu, William Gale, M. McConnell, Ankur Handa, Thomas M. Snyder, Dieter Fox
AI4CE, OOD
74 · 244 · 0 · 08 Nov 2018
Sanity Checks for Saliency Maps
Julius Adebayo, Justin Gilmer, M. Muelly, Ian Goodfellow, Moritz Hardt, Been Kim
FAtt, AAML, XAI
123 · 1,963 · 0 · 08 Oct 2018
A Benchmark for Interpretability Methods in Deep Neural Networks
Sara Hooker, D. Erhan, Pieter-Jan Kindermans, Been Kim
FAtt, UQCV
98 · 681 · 0 · 28 Jun 2018
A Theoretical Explanation for Perplexing Behaviors of Backpropagation-based Visualizations
Weili Nie, Yang Zhang, Ankit B. Patel
FAtt
120 · 151 · 0 · 18 May 2018
Manipulating and Measuring Model Interpretability
Forough Poursabzi-Sangdeh, D. Goldstein, Jake M. Hofman, Jennifer Wortman Vaughan, Hanna M. Wallach
84 · 697 · 0 · 21 Feb 2018
Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV)
Been Kim, Martin Wattenberg, Justin Gilmer, Carrie J. Cai, James Wexler, F. Viégas, Rory Sayres
FAtt
199 · 1,837 · 0 · 30 Nov 2017
Interpretation of Neural Networks is Fragile
Amirata Ghorbani, Abubakar Abid, James Zou
FAtt, AAML
124 · 865 · 0 · 29 Oct 2017
SmoothGrad: removing noise by adding noise
D. Smilkov, Nikhil Thorat, Been Kim, F. Viégas, Martin Wattenberg
FAtt, ODL
199 · 2,221 · 0 · 12 Jun 2017
Network Dissection: Quantifying Interpretability of Deep Visual Representations
David Bau, Bolei Zhou, A. Khosla, A. Oliva, Antonio Torralba
MILM, FAtt
132 · 1,514 · 1 · 19 Apr 2017