A Survey on Neural Network Interpretability

28 December 2020
Yu Zhang, Peter Tiño, A. Leonardis, K. Tang
Tags: FaML, XAI

Papers citing "A Survey on Neural Network Interpretability"

42 / 42 papers shown

Tuning for Trustworthiness -- Balancing Performance and Explanation Consistency in Neural Network Optimization
Alexander Hinterleitner, Thomas Bartz-Beielstein
0 citations · 12 May 2025

Deriving Equivalent Symbol-Based Decision Models from Feedforward Neural Networks
Sebastian Seidel, Uwe M. Borghoff
1 citation · 16 Apr 2025

Minimum Description Length of a Spectrum Variational Autoencoder: A Theory
Canlin Zhang, Xiuwen Liu
0 citations · 01 Apr 2025

Axiomatic Explainer Globalness via Optimal Transport
Davin Hill, Josh Bone, A. Masoomi, Max Torop, Jennifer Dy
1 citation · 13 Mar 2025

Explainable Neural Networks with Guarantees: A Sparse Estimation Approach
Antoine Ledent, Peng Liu
Tags: FAtt · 0 citations · 20 Feb 2025

Uncertainty-Aware Explanations Through Probabilistic Self-Explainable Neural Networks
Jon Vadillo, Roberto Santana, J. A. Lozano, Marta Z. Kwiatkowska
Tags: BDL, AAML · 0 citations · 17 Feb 2025

Beyond Label Attention: Transparency in Language Models for Automated Medical Coding via Dictionary Learning
John Wu, David Wu, Jimeng Sun
1 citation · 31 Oct 2024

Reinfier and Reintrainer: Verification and Interpretation-Driven Safe Deep Reinforcement Learning Frameworks
Zixuan Yang, Jiaqi Zheng, Guihai Chen
Tags: OffRL · 0 citations · 19 Oct 2024

DILA: Dictionary Label Attention for Mechanistic Interpretability in High-dimensional Multi-label Medical Coding Prediction
John Wu, David Wu, Jimeng Sun
0 citations · 16 Sep 2024

Explainable Artificial Intelligence: A Survey of Needs, Techniques, Applications, and Future Direction
Melkamu Mersha, Khang Lam, Joseph Wood, Ali AlShami, Jugal Kalita
Tags: XAI, AI4TS · 31 citations · 30 Aug 2024

Deep Learning without Global Optimization by Random Fourier Neural Networks
Owen Davis, Gianluca Geraci, Mohammad Motamed
Tags: BDL · 0 citations · 16 Jul 2024

Retrievable Domain-Sensitive Feature Memory for Multi-Domain Recommendation
Yuang Zhao, Zhaocheng Du, Qinglin Jia, Linxuan Zhang, Zhenhua Dong, Ruiming Tang
3 citations · 21 May 2024

3VL: Using Trees to Improve Vision-Language Models' Interpretability
Nir Yellinek, Leonid Karlinsky, Raja Giryes
Tags: CoGe, VLM · 4 citations · 28 Dec 2023

Explaining Deep Convolutional Neural Networks for Image Classification by Evolving Local Interpretable Model-agnostic Explanations
Bin Wang, Wenbin Pei, Bing Xue, Mengjie Zhang
Tags: FAtt · 3 citations · 28 Nov 2022

Benchmarking and Survey of Explanation Methods for Black Box Models
F. Bodria, F. Giannotti, Riccardo Guidotti, Francesca Naretto, D. Pedreschi, S. Rinzivillo
Tags: XAI · 224 citations · 25 Feb 2021

Interpretable Machine Learning -- A Brief History, State-of-the-Art and Challenges
Christoph Molnar, Giuseppe Casalicchio, B. Bischl
Tags: AI4TS, AI4CE · 400 citations · 19 Oct 2020

Explanations can be manipulated and geometry is to blame
Ann-Kathrin Dombrowski, Maximilian Alber, Christopher J. Anders, M. Ackermann, K. Müller, Pan Kessel
Tags: AAML, FAtt · 329 citations · 19 Jun 2019

Unmasking Clever Hans Predictors and Assessing What Machines Really Learn
Sebastian Lapuschkin, S. Wäldchen, Alexander Binder, G. Montavon, Wojciech Samek, K. Müller
1,005 citations · 26 Feb 2019

Fooling Neural Network Interpretations via Adversarial Model Manipulation
Juyeon Heo, Sunghwan Joo, Taesup Moon
Tags: AAML, FAtt · 201 citations · 06 Feb 2019

Explanations based on the Missing: Towards Contrastive Explanations with Pertinent Negatives
Amit Dhurandhar, Pin-Yu Chen, Ronny Luss, Chun-Chen Tu, Pai-Shun Ting, Karthikeyan Shanmugam, Payel Das
Tags: FAtt · 587 citations · 21 Feb 2018

A Survey Of Methods For Explaining Black Box Models
Riccardo Guidotti, A. Monreale, Salvatore Ruggieri, Franco Turini, D. Pedreschi, F. Giannotti
Tags: XAI · 3,922 citations · 06 Feb 2018

Net2Vec: Quantifying and Explaining how Concepts are Encoded by Filters in Deep Neural Networks
Ruth C. Fong, Andrea Vedaldi
Tags: FAtt · 263 citations · 10 Jan 2018

NAG: Network for Adversary Generation
Konda Reddy Mopuri, Utkarsh Ojha, Utsav Garg, R. Venkatesh Babu
Tags: AAML · 144 citations · 09 Dec 2017

Beyond Sparsity: Tree Regularization of Deep Models for Interpretability
Mike Wu, M. C. Hughes, S. Parbhoo, Maurizio Zazzi, Volker Roth, Finale Doshi-Velez
Tags: AI4CE · 281 citations · 16 Nov 2017

Interpreting Deep Visual Representations via Network Dissection
Bolei Zhou, David Bau, A. Oliva, Antonio Torralba
Tags: FAtt, MILM · 324 citations · 15 Nov 2017

Counterfactual Explanations without Opening the Black Box: Automated Decisions and the GDPR
Sandra Wachter, Brent Mittelstadt, Chris Russell
Tags: MLAU · 2,332 citations · 01 Nov 2017

Interpretable Explanations of Black Boxes by Meaningful Perturbation
Ruth C. Fong, Andrea Vedaldi
Tags: FAtt, AAML · 1,514 citations · 11 Apr 2017

Learning Important Features Through Propagating Activation Differences
Avanti Shrikumar, Peyton Greenside, A. Kundaje
Tags: FAtt · 3,848 citations · 10 Apr 2017

Understanding Black-box Predictions via Influence Functions
Pang Wei Koh, Percy Liang
Tags: TDI · 2,854 citations · 14 Mar 2017

Axiomatic Attribution for Deep Networks
Mukund Sundararajan, Ankur Taly, Qiqi Yan
Tags: OOD, FAtt · 5,920 citations · 04 Mar 2017

Universal adversarial perturbations
Seyed-Mohsen Moosavi-Dezfooli, Alhussein Fawzi, Omar Fawzi, P. Frossard
Tags: AAML · 2,520 citations · 26 Oct 2016

Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network
C. Ledig, Lucas Theis, Ferenc Huszár, Jose Caballero, Andrew Cunningham, ..., Andrew P. Aitken, Alykhan Tejani, J. Totz, Zehan Wang, Wenzhe Shi
Tags: GAN · 10,646 citations · 15 Sep 2016

The Mythos of Model Interpretability
Zachary Chase Lipton
Tags: FaML · 3,672 citations · 10 Jun 2016

Learning Deep Features for Discriminative Localization
Bolei Zhou, A. Khosla, Àgata Lapedriza, A. Oliva, Antonio Torralba
Tags: SSL, SSeg, FAtt · 9,266 citations · 14 Dec 2015

The Power of Depth for Feedforward Neural Networks
Ronen Eldan, Ohad Shamir
731 citations · 12 Dec 2015

Object Detectors Emerge in Deep Scene CNNs
Bolei Zhou, A. Khosla, Àgata Lapedriza, A. Oliva, Antonio Torralba
Tags: ObjD · 1,279 citations · 22 Dec 2014

Striving for Simplicity: The All Convolutional Net
Jost Tobias Springenberg, Alexey Dosovitskiy, Thomas Brox, Martin Riedmiller
Tags: FAtt · 4,653 citations · 21 Dec 2014

Understanding Deep Image Representations by Inverting Them
Aravindh Mahendran, Andrea Vedaldi
Tags: FAtt · 1,959 citations · 26 Nov 2014

Intriguing properties of neural networks
Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, D. Erhan, Ian Goodfellow, Rob Fergus
Tags: AAML · 14,831 citations · 21 Dec 2013

Visualizing and Understanding Convolutional Networks
Matthew D. Zeiler, Rob Fergus
Tags: FAtt, SSL · 15,825 citations · 12 Nov 2013

Distributed Representations of Words and Phrases and their Compositionality
Tomas Mikolov, Ilya Sutskever, Kai Chen, G. Corrado, J. Dean
Tags: NAI, OCL · 33,445 citations · 16 Oct 2013

Invariant Scattering Convolution Networks
Joan Bruna, S. Mallat
1,272 citations · 05 Mar 2012