The Out-of-Distribution Problem in Explainability and Search Methods for Feature Importance Explanations
Peter Hase, Harry Xie, Joey Tianyi Zhou (1 June 2021)
arXiv: 2106.00786
Tags: OODD, LRM, FAtt
Papers citing "The Out-of-Distribution Problem in Explainability and Search Methods for Feature Importance Explanations" (50 of 64 papers shown):
Explanations as Bias Detectors: A Critical Study of Local Post-hoc XAI Methods for Fairness Exploration
Vasiliki Papanikou, Danae Pla Karidi, E. Pitoura, Emmanouil Panagiotou, Eirini Ntoutsi (01 May 2025)

Probabilistic Stability Guarantees for Feature Attributions
Helen Jin, Anton Xue, Weiqiu You, Surbhi Goel, Eric Wong (18 Apr 2025)

PCBEAR: Pose Concept Bottleneck for Explainable Action Recognition
Jongseo Lee, Wooil Lee, Gyeong-Moon Park, Seong Tae Kim, Jinwoo Choi (17 Apr 2025)

Towards Spatially-Aware and Optimally Faithful Concept-Based Explanations
Shubham Kumar, Dwip Dalal, Narendra Ahuja (15 Apr 2025)

Noiser: Bounded Input Perturbations for Attributing Large Language Models
Mohammad Reza Ghasemi Madani, Aryo Pradipta Gema, Gabriele Sarti, Yu Zhao, Pasquale Minervini, Andrea Passerini (03 Apr 2025) [AAML]

Self-Explaining Neural Networks for Business Process Monitoring
Shahaf Bassan, Shlomit Gur, Sergey Zeltyn, Konstantinos Mavrogiorgos, Ron Eliav, Dimosthenis Kyriazis (23 Mar 2025)

Are formal and functional linguistic mechanisms dissociated in language models?
Michael Hanna, Sandro Pezzelle, Yonatan Belinkov (14 Mar 2025)

A Tale of Two Imperatives: Privacy and Explainability
Supriya Manna, Niladri Sett (30 Dec 2024)

From Flexibility to Manipulation: The Slippery Slope of XAI Evaluation
Kristoffer Wickstrøm, Marina M.-C. Höhne, Anna Hedström (07 Dec 2024) [AAML]

Benchmarking XAI Explanations with Human-Aligned Evaluations
Rémi Kazmierczak, Steve Azzolin, Eloise Berthier, Anna Hedström, Patricia Delhomme, ..., Goran Frehse, Massimiliano Mancini, Baptiste Caramiaux, Andrea Passerini, Gianni Franchi (04 Nov 2024)

Towards Faithful Natural Language Explanations: A Study Using Activation Patching in Large Language Models
Wei Jie Yeo, Ranjan Satapathy, Erik Cambria (18 Oct 2024)

F-Fidelity: A Robust Framework for Faithfulness Evaluation of Explainable AI
Xu Zheng, Farhad Shirani, Zhuomin Chen, Chaohao Lin, Wei Cheng, Wenbo Guo, Dongsheng Luo (03 Oct 2024) [AAML]

Faithfulness and the Notion of Adversarial Sensitivity in NLP Explanations
Supriya Manna, Niladri Sett (26 Sep 2024) [AAML]

Optimal ablation for interpretability
Maximilian Li, Lucas Janson (16 Sep 2024) [FAtt]

Counterfactuals As a Means for Evaluating Faithfulness of Attribution Methods in Autoregressive Language Models
Sepehr Kamahi, Yadollah Yaghoobzadeh (21 Aug 2024)

Normalized AOPC: Fixing Misleading Faithfulness Metrics for Feature Attribution Explainability
Joakim Edin, Andreas Geert Motzfeldt, Casper L. Christensen, Tuukka Ruotsalo, Lars Maaløe, Maria Maistro (15 Aug 2024)

Hard to Explain: On the Computational Hardness of In-Distribution Model Interpretation
Guy Amir, Shahaf Bassan, Guy Katz (07 Aug 2024)

BEExAI: Benchmark to Evaluate Explainable AI
Samuel Sithakoul, Sara Meftah, Clément Feutry (29 Jul 2024)

Benchmarking the Attribution Quality of Vision Models
Robin Hesse, Simone Schaub-Meyer, Stefan Roth (16 Jul 2024) [FAtt]

Noise-Free Explanation for Driving Action Prediction
Hongbo Zhu, Theodor Wulff, R. S. Maharjan, Jinpei Han, Angelo Cangelosi (08 Jul 2024) [AAML, FAtt]

Efficient and Accurate Explanation Estimation with Distribution Compression
Hubert Baniecki, Giuseppe Casalicchio, Bernd Bischl, Przemyslaw Biecek (26 Jun 2024) [FAtt]

How to use and interpret activation patching
Stefan Heimersheim, Neel Nanda (23 Apr 2024)

CA-Stream: Attention-based pooling for interpretable image recognition
Felipe Torres, Hanwei Zhang, R. Sicre, Stéphane Ayache, Yannis Avrithis (23 Apr 2024)

On the Faithfulness of Vision Transformer Explanations
Junyi Wu, Weitai Kang, Hao Tang, Yuan Hong, Yan Yan (01 Apr 2024)

Attention Meets Post-hoc Interpretability: A Mathematical Perspective
Gianluigi Lopardo, F. Precioso, Damien Garreau (05 Feb 2024)

ReAGent: A Model-agnostic Feature Attribution Method for Generative Language Models
Zhixue Zhao, Boxuan Shan (01 Feb 2024)

Explaining Time Series via Contrastive and Locally Sparse Perturbations
Zichuan Liu, Yingying Zhang, Tianchun Wang, Zefan Wang, Dongsheng Luo, ..., Min Wu, Yi Wang, Chunlin Chen, Lunting Fan, Qingsong Wen (16 Jan 2024)

Decoupling Pixel Flipping and Occlusion Strategy for Consistent XAI Benchmarks
Stefan Blücher, Johanna Vielhaben, Nils Strodthoff (12 Jan 2024) [AAML]

The Compute Divide in Machine Learning: A Threat to Academic Contribution and Scrutiny?
T. Besiroglu, S. Bergerson, Amelia Michael, Lennart Heim, Xueyun Luo, Neil Thompson (04 Jan 2024)

Faithful and Robust Local Interpretability for Textual Predictions
Gianluigi Lopardo, F. Precioso, Damien Garreau (30 Oct 2023) [OOD]

Faithfulness Measurable Masked Language Models
Andreas Madsen, Siva Reddy, Sarath Chandar (11 Oct 2023)

Evaluating Explanation Methods for Vision-and-Language Navigation
Guanqi Chen, Lei Yang, Guanhua Chen, Jia Pan (10 Oct 2023) [XAI]

Towards Robust Fidelity for Evaluating Explainability of Graph Neural Networks
Xu Zheng, Farhad Shirani, Tianchun Wang, Wei Cheng, Zhuomin Chen, Haifeng Chen, Hua Wei, Dongsheng Luo (03 Oct 2023)

Towards Best Practices of Activation Patching in Language Models: Metrics and Methods
Fred Zhang, Neel Nanda (27 Sep 2023) [LLMSV]

Goodhart's Law Applies to NLP's Explanation Benchmarks
Jennifer Hsia, Danish Pruthi, Aarti Singh, Zachary Chase Lipton (28 Aug 2023)

FunnyBirds: A Synthetic Vision Dataset for a Part-Based Analysis of Explainable AI Methods
Robin Hesse, Simone Schaub-Meyer, Stefan Roth (11 Aug 2023) [AAML]

Decoding Layer Saliency in Language Transformers
Elizabeth M. Hou, Greg Castañón (09 Aug 2023) [MILM]

The Co-12 Recipe for Evaluating Interpretable Part-Prototype Image Classifiers
Meike Nauta, Christin Seifert (26 Jul 2023)

Encoding Time-Series Explanations through Self-Supervised Model Behavior Consistency
Owen Queen, Thomas Hartvigsen, Teddy Koker, Huan He, Theodoros Tsiligkaridis, Marinka Zitnik (03 Jun 2023) [AI4TS]

Efficient Shapley Values Estimation by Amortization for Text Classification
Chenghao Yang, Fan Yin, He He, Kai-Wei Chang, Xiaofei Ma, Bing Xiang (31 May 2023) [FAtt, VLM]

On the Impact of Knowledge Distillation for Model Interpretability
Hyeongrok Han, Siwon Kim, Hyun-Soo Choi, Sungroh Yoon (25 May 2023)

Understanding Post-hoc Explainers: The Case of Anchors
Gianluigi Lopardo, F. Precioso, Damien Garreau (15 Mar 2023) [FAtt]

Relational Local Explanations
V. Borisov, Gjergji Kasneci (23 Dec 2022) [FAtt]

CRAFT: Concept Recursive Activation FacTorization for Explainability
Thomas Fel, Agustin Picard, Louis Bethune, Thibaut Boissin, David Vigouroux, Julien Colin, Rémi Cadène, Thomas Serre (17 Nov 2022)

What Makes a Good Explanation?: A Harmonized View of Properties of Explanations
Zixi Chen, Varshini Subhash, Marton Havasi, Weiwei Pan, Finale Doshi-Velez (10 Nov 2022) [XAI, FAtt]

Learning Unsupervised Hierarchies of Audio Concepts
Darius Afchar, Romain Hennequin, Vincent Guigue (21 Jul 2022)

BASED-XAI: Breaking Ablation Studies Down for Explainable Artificial Intelligence
Isha Hameed, Samuel Sharpe, Daniel Barcklow, Justin Au-yeung, Sahil Verma, Jocelyn Huang, Brian Barr, C. Bayan Bruss (12 Jul 2022)

TalkToModel: Explaining Machine Learning Models with Interactive Natural Language Conversations
Dylan Slack, Satyapriya Krishna, Himabindu Lakkaraju, Sameer Singh (08 Jul 2022)

VisFIS: Visual Feature Importance Supervision with Right-for-the-Right-Reason Objectives
Zhuofan Ying, Peter Hase, Joey Tianyi Zhou (22 Jun 2022) [LRM]
Explanation-based Counterfactual Retraining (XCR): A Calibration Method for Black-box Models
Liu Zhendong, Wenyu Jiang, Yan Zhang, Chongjun Wang (22 Jun 2022) [CML]