Interpretation of Neural Networks is Fragile
arXiv: 1710.10547 · 29 October 2017
Amirata Ghorbani, Abubakar Abid, James Zou
Tags: FAtt, AAML

Papers citing "Interpretation of Neural Networks is Fragile"

50 / 467 papers shown
Best of both worlds: local and global explanations with human-understandable concepts
Jessica Schrouff, Sebastien Baur, Shaobo Hou, Diana Mincu, Eric Loreaux, Ralph Blanes, James Wexler, Alan Karthikesalingam, Been Kim
FAtt · 26 · 27 · 0 · 16 Jun 2021

S-LIME: Stabilized-LIME for Model Explanation
Zhengze Zhou, Giles Hooker, Fei Wang
FAtt · 27 · 86 · 0 · 15 Jun 2021

On the Lack of Robust Interpretability of Neural Text Classifiers
Muhammad Bilal Zafar, Michele Donini, Dylan Slack, Cédric Archambeau, Sanjiv Ranjan Das, K. Kenthapadi
AAML · 11 · 21 · 0 · 08 Jun 2021

3DB: A Framework for Debugging Computer Vision Models
Guillaume Leclerc, Hadi Salman, Andrew Ilyas, Sai H. Vemprala, Logan Engstrom, ..., Pengchuan Zhang, Shibani Santurkar, Greg Yang, Ashish Kapoor, A. Madry
40 · 40 · 0 · 07 Jun 2021

Evaluating Local Explanations using White-box Models
Amir Hossein Akhavan Rahnama, Judith Butepage, Pierre Geurts, Henrik Bostrom
FAtt · 27 · 0 · 0 · 04 Jun 2021

DISSECT: Disentangled Simultaneous Explanations via Concept Traversals
Asma Ghandeharioun, Been Kim, Chun-Liang Li, Brendan Jou, B. Eoff, Rosalind W. Picard
AAML · 30 · 53 · 0 · 31 May 2021

The effectiveness of feature attribution methods and its correlation with automatic evaluation scores
Giang Nguyen, Daeyoung Kim, Anh Totti Nguyen
FAtt · 21 · 86 · 0 · 31 May 2021

Drop Clause: Enhancing Performance, Interpretability and Robustness of the Tsetlin Machine
Jivitesh Sharma, Rohan Kumar Yadav, Ole-Christoffer Granmo, Lei Jiao
VLM · 26 · 12 · 0 · 30 May 2021

EDDA: Explanation-driven Data Augmentation to Improve Explanation Faithfulness
Ruiwen Li, Zhibo Zhang, Jiani Li, C. Trabelsi, Scott Sanner, Jongseong Jang, Yeonjeong Jeong, Dongsub Shim
AAML · 11 · 1 · 0 · 29 May 2021

Fooling Partial Dependence via Data Poisoning
Hubert Baniecki, Wojciech Kretowicz, P. Biecek
AAML · 29 · 23 · 0 · 26 May 2021

Information-theoretic Evolution of Model Agnostic Global Explanations
Sukriti Verma, Nikaash Puri, Piyush B. Gupta, Balaji Krishnamurthy
FAtt · 29 · 0 · 0 · 14 May 2021

XAI Handbook: Towards a Unified Framework for Explainable AI
Sebastián M. Palacio, Adriano Lucieri, Mohsin Munir, Jörn Hees, Sheraz Ahmed, Andreas Dengel
25 · 32 · 0 · 14 May 2021

Leveraging Sparse Linear Layers for Debuggable Deep Networks
Eric Wong, Shibani Santurkar, A. Madry
FAtt · 22 · 88 · 0 · 11 May 2021

Interpretable Semantic Photo Geolocation
Jonas Theiner, Eric Müller-Budack, Ralph Ewerth
18 · 30 · 0 · 30 Apr 2021

Towards Adversarial Patch Analysis and Certified Defense against Crowd Counting
Qiming Wu, Zhikang Zou, Pan Zhou, Xiaoqing Ye, Binghui Wang, Ang Li
AAML · 19 · 4 · 0 · 22 Apr 2021

On the Sensitivity and Stability of Model Interpretations in NLP
Fan Yin, Zhouxing Shi, Cho-Jui Hsieh, Kai-Wei Chang
FAtt · 19 · 33 · 0 · 18 Apr 2021

Evaluating Saliency Methods for Neural Language Models
Shuoyang Ding, Philipp Koehn
FAtt, XAI · 23 · 54 · 0 · 12 Apr 2021

A-FMI: Learning Attributions from Deep Networks via Feature Map Importance
An Zhang, Xiang Wang, Chengfang Fang, Jie Shi, Tat-Seng Chua, Zehua Chen
FAtt · 24 · 3 · 0 · 12 Apr 2021

Sparse Oblique Decision Trees: A Tool to Understand and Manipulate Neural Net Features
Suryabhan Singh Hada, Miguel Á. Carreira-Perpiñán, Arman Zharmagambetov
15 · 17 · 0 · 07 Apr 2021

Neural Response Interpretation through the Lens of Critical Pathways
Ashkan Khakzar, Soroosh Baselizadeh, Saurabh Khanduja, Christian Rupprecht, Seong Tae Kim, Nassir Navab
29 · 32 · 0 · 31 Mar 2021

Building Reliable Explanations of Unreliable Neural Networks: Locally Smoothing Perspective of Model Interpretation
Dohun Lim, Hyeonseok Lee, Sungchan Kim
FAtt, AAML · 23 · 13 · 0 · 26 Mar 2021

ExAD: An Ensemble Approach for Explanation-based Adversarial Detection
R. Vardhan, Ninghao Liu, Phakpoom Chinprutthiwong, Weijie Fu, Zhen Hu, Xia Hu, G. Gu
AAML · 26 · 4 · 0 · 22 Mar 2021

CACTUS: Detecting and Resolving Conflicts in Objective Functions
Subhajit Das, Alex Endert
19 · 0 · 0 · 13 Mar 2021

Human-Understandable Decision Making for Visual Recognition
Xiaowei Zhou, Jie Yin, Ivor Tsang, Chen Wang
FAtt, HAI · 15 · 1 · 0 · 05 Mar 2021

Detecting Spurious Correlations with Sanity Tests for Artificial Intelligence Guided Radiology Systems
U. Mahmood, Robik Shrestha, D. Bates, L. Mannelli, G. Corrias, Y. Erdi, Christopher Kanan
18 · 16 · 0 · 04 Mar 2021

Do Input Gradients Highlight Discriminative Features?
Harshay Shah, Prateek Jain, Praneeth Netrapalli
AAML, FAtt · 23 · 57 · 0 · 25 Feb 2021

Resilience of Bayesian Layer-Wise Explanations under Adversarial Attacks
Ginevra Carbone, G. Sanguinetti, Luca Bortolussi
FAtt, AAML · 21 · 4 · 0 · 22 Feb 2021

Towards the Unification and Robustness of Perturbation and Gradient Based Explanations
Sushant Agarwal, S. Jabbari, Chirag Agarwal, Sohini Upadhyay, Zhiwei Steven Wu, Himabindu Lakkaraju
FAtt, AAML · 24 · 60 · 0 · 21 Feb 2021

The Mind's Eye: Visualizing Class-Agnostic Features of CNNs
Alexandros Stergiou
FAtt · 11 · 3 · 0 · 29 Jan 2021

Better sampling in explanation methods can prevent dieselgate-like deception
Domen Vreš, Marko Robnik-Šikonja
AAML · 15 · 10 · 0 · 26 Jan 2021

Investigating the significance of adversarial attacks and their relation to interpretability for radar-based human activity recognition systems
Utku Ozbulak, Baptist Vandersmissen, A. Jalalvand, Ivo Couckuyt, Arnout Van Messem, W. D. Neve
AAML · 11 · 18 · 0 · 26 Jan 2021

Show or Suppress? Managing Input Uncertainty in Machine Learning Model Explanations
Danding Wang, Wencan Zhang, Brian Y. Lim
FAtt · 24 · 22 · 0 · 23 Jan 2021

How can I choose an explainer? An Application-grounded Evaluation of Post-hoc Explanations
Sérgio Jesus, Catarina Belém, Vladimir Balayan, João Bento, Pedro Saleiro, P. Bizarro, João Gama
136 · 120 · 0 · 21 Jan 2021

Towards interpreting ML-based automated malware detection models: a survey
Yuzhou Lin, Xiaolin Chang
12 · 7 · 0 · 15 Jan 2021

Explainability of deep vision-based autonomous driving systems: Review and challenges
Éloi Zablocki, H. Ben-younes, P. Pérez, Matthieu Cord
XAI · 48 · 170 · 0 · 13 Jan 2021

Enhanced Regularizers for Attributional Robustness
A. Sarkar, Anirban Sarkar, V. Balasubramanian
19 · 16 · 0 · 28 Dec 2020

A Survey on Neural Network Interpretability
Yu Zhang, Peter Tiño, A. Leonardis, K. Tang
FaML, XAI · 144 · 661 · 0 · 28 Dec 2020

To what extent do human explanations of model behavior align with actual model behavior?
Grusha Prasad, Yixin Nie, Joey Tianyi Zhou, Robin Jia, Douwe Kiela, Adina Williams
31 · 28 · 0 · 24 Dec 2020

Towards Robust Explanations for Deep Neural Networks
Ann-Kathrin Dombrowski, Christopher J. Anders, K. Müller, Pan Kessel
FAtt · 30 · 63 · 0 · 18 Dec 2020

Robustness to Spurious Correlations in Text Classification via Automatically Generated Counterfactuals
Zhao Wang, A. Culotta
CML, OOD · 20 · 98 · 0 · 18 Dec 2020

Debiased-CAM to mitigate image perturbations with faithful visual explanations of machine learning
Wencan Zhang, Mariella Dimiccoli, Brian Y. Lim
FAtt · 24 · 18 · 0 · 10 Dec 2020

Understanding Interpretability by generalized distillation in Supervised Classification
Adit Agarwal, K.K. Shukla, Arjan Kuijper, Anirban Mukhopadhyay
FaML, FAtt · 29 · 0 · 0 · 05 Dec 2020

LRTA: A Transparent Neural-Symbolic Reasoning Framework with Modular Supervision for Visual Question Answering
Weixin Liang, Fei Niu, Aishwarya N. Reganti, Govind Thattai, Gokhan Tur
34 · 17 · 0 · 21 Nov 2020

Backdoor Attacks on the DNN Interpretation System
Shihong Fang, A. Choromańska
FAtt, AAML · 29 · 19 · 0 · 21 Nov 2020

One Explanation is Not Enough: Structured Attention Graphs for Image Classification
Vivswan Shitole, Li Fuxin, Minsuk Kahng, Prasad Tadepalli, Alan Fern
FAtt, GNN · 22 · 38 · 0 · 13 Nov 2020

Robust and Stable Black Box Explanations
Himabindu Lakkaraju, Nino Arsov, Osbert Bastani
AAML, FAtt · 24 · 84 · 0 · 12 Nov 2020

Debugging Tests for Model Explanations
Julius Adebayo, M. Muelly, Ilaria Liccardi, Been Kim
FAtt · 19 · 177 · 0 · 10 Nov 2020

Benchmarking Deep Learning Interpretability in Time Series Predictions
Aya Abdelsalam Ismail, Mohamed K. Gunady, H. C. Bravo, S. Feizi
XAI, AI4TS, FAtt · 22 · 168 · 0 · 26 Oct 2020

Measuring Association Between Labels and Free-Text Rationales
Sarah Wiegreffe, Ana Marasović, Noah A. Smith
282 · 170 · 0 · 24 Oct 2020

Exemplary Natural Images Explain CNN Activations Better than State-of-the-Art Feature Visualization
Judy Borowski, Roland S. Zimmermann, Judith Schepers, Robert Geirhos, Thomas S. A. Wallis, Matthias Bethge, Wieland Brendel
FAtt · 42 · 7 · 0 · 23 Oct 2020