A Benchmark for Interpretability Methods in Deep Neural Networks (arXiv:1806.10758)
28 June 2018
Sara Hooker, D. Erhan, Pieter-Jan Kindermans, Been Kim
FAtt, UQCV
Papers citing "A Benchmark for Interpretability Methods in Deep Neural Networks" (50 of 143 papers shown)
ModelDiff: A Framework for Comparing Learning Algorithms
Harshay Shah, Sung Min Park, Andrew Ilyas, A. Madry
SyDa · 22 Nov 2022

Explaining Image Classifiers with Multiscale Directional Image Representation
Stefan Kolek, Robert Windesheim, Héctor Andrade-Loarca, Gitta Kutyniok, Ron Levie
22 Nov 2022

Model free variable importance for high dimensional data
Naofumi Hama, Masayoshi Mase, Art B. Owen
15 Nov 2022

Easy to Decide, Hard to Agree: Reducing Disagreements Between Saliency Methods
Josip Jukić, Martin Tutek, Jan Snajder
FAtt · 15 Nov 2022

New Definitions and Evaluations for Saliency Methods: Staying Intrinsic, Complete and Sound
Arushi Gupta, Nikunj Saunshi, Dingli Yu, Kaifeng Lyu, Sanjeev Arora
AAML, FAtt, XAI · 05 Nov 2022

Exploring Self-Attention for Crop-type Classification Explainability
Ivica Obadic, R. Roscher, Dario Augusto Borges Oliveira, Xiao Xiang Zhu
24 Oct 2022

Computing Abductive Explanations for Boosted Trees
Gilles Audemard, Jean-Marie Lagniez, Pierre Marquis, N. Szczepanski
16 Sep 2022
Defending Against Backdoor Attack on Graph Neural Network by Explainability
B. Jiang, Zhao Li
AAML, GNN · 07 Sep 2022

ferret: a Framework for Benchmarking Explainers on Transformers
Giuseppe Attanasio, Eliana Pastor, C. Bonaventura, Debora Nozza
02 Aug 2022

Leveraging Explanations in Interactive Machine Learning: An Overview
Stefano Teso, Öznur Alkan, Wolfgang Stammer, Elizabeth M. Daly
XAI, FAtt, LRM · 29 Jul 2022

BASED-XAI: Breaking Ablation Studies Down for Explainable Artificial Intelligence
Isha Hameed, Samuel Sharpe, Daniel Barcklow, Justin Au-yeung, Sahil Verma, Jocelyn Huang, Brian Barr, C. Bayan Bruss
12 Jul 2022

FRAME: Evaluating Rationale-Label Consistency Metrics for Free-Text Rationales
Aaron Chan, Shaoliang Nie, Liang Tan, Xiaochang Peng, Hamed Firooz, Maziar Sanjabi, Xiang Ren
02 Jul 2022

BAGEL: A Benchmark for Assessing Graph Neural Network Explanations
Mandeep Rathee, Thorben Funke, Avishek Anand, Megha Khosla
28 Jun 2022
Explanation-based Counterfactual Retraining (XCR): A Calibration Method for Black-box Models
Liu Zhendong, Wenyu Jiang, Yan Zhang, Chongjun Wang
CML · 22 Jun 2022

Which Explanation Should I Choose? A Function Approximation Perspective to Characterizing Post Hoc Explanations
Tessa Han, Suraj Srinivas, Himabindu Lakkaraju
FAtt · 02 Jun 2022

Attribution-based Explanations that Provide Recourse Cannot be Robust
H. Fokkema, R. D. Heide, T. Erven
FAtt · 31 May 2022

How explainable are adversarially-robust CNNs?
Mehdi Nourelahi, Lars Kotthoff, Peijie Chen, Anh Totti Nguyen
AAML, FAtt · 25 May 2022

Faithful Explanations for Deep Graph Models
Zifan Wang, Yuhang Yao, Chaoran Zhang, Han Zhang, Youjie Kang, Carlee Joe-Wong, Matt Fredrikson, Anupam Datta
FAtt · 24 May 2022

Necessity and Sufficiency for Explaining Text Classifiers: A Case Study in Hate Speech Detection
Esma Balkir, I. Nejadgholi, Kathleen C. Fraser, S. Kiritchenko
FAtt · 06 May 2022
Adapting and Evaluating Influence-Estimation Methods for Gradient-Boosted Decision Trees
Jonathan Brophy, Zayd Hammoudeh, Daniel Lowd
TDI · 30 Apr 2022

ViTOL: Vision Transformer for Weakly Supervised Object Localization
Saurav Gupta, Sourav Lakhotia, Abhay Rawat, Rahul Tallamraju
WSOL · 14 Apr 2022

Maximum Entropy Baseline for Integrated Gradients
Hanxiao Tan
FAtt · 12 Apr 2022

Analyzing the Effects of Handling Data Imbalance on Learned Features from Medical Images by Looking Into the Models
Ashkan Khakzar, Yawei Li, Yang Zhang, Mirac Sanisoglu, Seong Tae Kim, Mina Rezaei, Bernd Bischl, Nassir Navab
04 Apr 2022

Label-Free Explainability for Unsupervised Models
Jonathan Crabbé, M. Schaar
FAtt, MILM · 03 Mar 2022

MUC-driven Feature Importance Measurement and Adversarial Analysis for Random Forest
Shucen Ma, Jianqi Shi, Yanhong Huang, Shengchao Qin, Zhe Hou
AAML · 25 Feb 2022
Evaluating Feature Attribution Methods in the Image Domain
Arne Gevaert, Axel-Jan Rousseau, Thijs Becker, D. Valkenborg, T. D. Bie, Yvan Saeys
FAtt · 22 Feb 2022

Don't Lie to Me! Robust and Efficient Explainability with Verified Perturbation Analysis
Thomas Fel, Mélanie Ducoffe, David Vigouroux, Rémi Cadène, Mikael Capelle, C. Nicodeme, Thomas Serre
AAML · 15 Feb 2022

Multi-Modal Knowledge Graph Construction and Application: A Survey
Xiangru Zhu, Zhixu Li, Xiaodan Wang, Xueyao Jiang, Penglei Sun, Xuwu Wang, Yanghua Xiao, N. Yuan
11 Feb 2022

The impact of feature importance methods on the interpretation of defect classifiers
Gopi Krishnan Rajbahadur, Shaowei Wang, Yasutaka Kamei, Ahmed E. Hassan
FAtt · 04 Feb 2022

Diagnosing AI Explanation Methods with Folk Concepts of Behavior
Alon Jacovi, Jasmijn Bastings, Sebastian Gehrmann, Yoav Goldberg, Katja Filippova
27 Jan 2022

When less is more: Simplifying inputs aids neural network understanding
R. Schirrmeister, Rosanne Liu, Sara Hooker, T. Ball
14 Jan 2022
Topological Representations of Local Explanations
Peter Xenopoulos, G. Chan, Harish Doraiswamy, L. G. Nonato, Brian Barr, Claudio Silva
FAtt · 06 Jan 2022

Explain, Edit, and Understand: Rethinking User Study Design for Evaluating Model Explanations
Siddhant Arora, Danish Pruthi, Norman M. Sadeh, William W. Cohen, Zachary Chase Lipton, Graham Neubig
FAtt · 17 Dec 2021

UNIREX: A Unified Learning Framework for Language Model Rationale Extraction
Aaron Chan, Maziar Sanjabi, Lambert Mathias, L. Tan, Shaoliang Nie, Xiaochang Peng, Xiang Ren, Hamed Firooz
16 Dec 2021

Evaluating saliency methods on artificial data with different background types
Céline Budding, Fabian Eitel, K. Ritter, Stefan Haufe
XAI, FAtt, MedIm · 09 Dec 2021

HIVE: Evaluating the Human Interpretability of Visual Explanations
Sunnie S. Y. Kim, Nicole Meister, V. V. Ramaswamy, Ruth C. Fong, Olga Russakovsky
06 Dec 2021

Improving Deep Learning Interpretability by Saliency Guided Training
Aya Abdelsalam Ismail, H. C. Bravo, S. Feizi
FAtt · 29 Nov 2021
Look at the Variance! Efficient Black-box Explanations with Sobol-based Sensitivity Analysis
Thomas Fel, Rémi Cadène, Mathieu Chalvidal, Matthieu Cord, David Vigouroux, Thomas Serre
MLAU, FAtt, AAML · 07 Nov 2021

Human Attention in Fine-grained Classification
Yao Rong, Wenjia Xu, Zeynep Akata, Enkelejda Kasneci
02 Nov 2021

Double Trouble: How to not explain a text classifier's decisions using counterfactuals synthesized by masked language models?
Thang M. Pham, Trung H. Bui, Long Mai, Anh Totti Nguyen
22 Oct 2021

Coalitional Bayesian Autoencoders -- Towards explainable unsupervised deep learning
Bang Xiang Yong, Alexandra Brintrup
19 Oct 2021

Evaluating the Faithfulness of Importance Measures in NLP by Recursively Masking Allegedly Important Tokens and Retraining
Andreas Madsen, Nicholas Meade, Vaibhav Adlakha, Siva Reddy
15 Oct 2021

Deep Synoptic Monte Carlo Planning in Reconnaissance Blind Chess
Gregory Clark
05 Oct 2021

Discriminative Attribution from Counterfactuals
N. Eckstein, A. S. Bates, G. Jefferis, Jan Funke
FAtt, CML · 28 Sep 2021
Time Series Model Attribution Visualizations as Explanations
U. Schlegel, Daniel A. Keim
TDI, BDL, FAtt, AI4TS, XAI · 27 Sep 2021

Longitudinal Distance: Towards Accountable Instance Attribution
Rosina O. Weber, Prateek Goel, S. Amiri, G. Simpson
23 Aug 2021

Temporal Dependencies in Feature Importance for Time Series Predictions
Kin Kwan Leung, Clayton Rooke, Jonathan Smith, S. Zuberi, M. Volkovs
OOD, AI4TS · 29 Jul 2021

Attribution of Predictive Uncertainties in Classification Models
Iker Perez, Piotr Skalski, Alec E. Barns-Graham, Jason Wong, David Sutton
UQCV · 19 Jul 2021

Synthetic Benchmarks for Scientific Research in Explainable Machine Learning
Yang Liu, Sujay Khandagale, Colin White, Willie Neiswanger
23 Jun 2021

Keep CALM and Improve Visual Feature Attribution
Jae Myung Kim, Junsuk Choe, Zeynep Akata, Seong Joon Oh
FAtt · 15 Jun 2021