Beyond Word Importance: Contextual Decomposition to Extract Interactions from LSTMs

16 January 2018

Papers citing "Beyond Word Importance: Contextual Decomposition to Extract Interactions from LSTMs"

50 / 125 papers shown

From n-gram to Attention: How Model Architectures Learn and Propagate Bias in Language Modeling Mohsinul Kabir Tasfia Tahsin Sophia Ananiadou KELM AI4CE 68 0 0 18 May 2025
Can Input Attributions Explain Inductive Reasoning in In-Context Learning? Mengyu Ye Tatsuki Kuribayashi Goro Kobayashi Jun Suzuki LRM 164 0 0 20 Dec 2024
Towards More Trustworthy and Interpretable LLMs for Code through Syntax-Grounded Explanations David Nader-Palacio Daniel Rodríguez-Cárdenas Alejandro Velasco Dipin Khati Kevin Moran Denys Poshyvanyk 96 6 0 12 Jul 2024
KernelSHAP-IQ: Weighted Least-Square Optimization for Shapley Interactions Fabian Fumagalli Maximilian Muschalik Patrick Kolpaczki Eyke Hüllermeier Barbara Hammer 112 7 0 17 May 2024
Explaining with Contrastive Phrasal Highlighting: A Case Study in Assisting Humans to Detect Translation Differences Eleftheria Briakou Navita Goyal Marine Carpuat 89 3 0 04 Dec 2023
Proto-lm: A Prototypical Network-Based Framework for Built-in Interpretability in Large Language Models Sean Xie Soroush Vosoughi Saeed Hassanpour 122 4 0 03 Nov 2023
On the Interplay between Fairness and Explainability Stephanie Brandl Emanuele Bugliarello Ilias Chalkidis FaML 99 5 0 25 Oct 2023
Uncovering hidden geometry in Transformers via disentangling position and context Jiajun Song Yiqiao Zhong 80 10 0 07 Oct 2023
Towards Fixing Clever-Hans Predictors with Counterfactual Knowledge Distillation Sidney Bender Christopher J. Anders Pattarawat Chormai Heike Marxfeld J. Herrmann G. Montavon CML 67 2 0 02 Oct 2023
Interpretable Long-Form Legal Question Answering with Retrieval-Augmented Large Language Models Antoine Louis Gijs van Dijck Gerasimos Spanakis ELM AILaw 72 41 0 29 Sep 2023
Unveiling Vulnerabilities in Interpretable Deep Learning Systems with Query-Efficient Black-box Attacks Eldor Abdukhamidov Mohammed Abuhamad Simon S. Woo Eric Chan-Tin Tamer Abuhmed AAML 51 3 0 21 Jul 2023
Microbial Genetic Algorithm-based Black-box Attack against Interpretable Deep Learning Systems Eldor Abdukhamidov Mohammed Abuhamad Simon S. Woo Eric Chan-Tin Tamer Abuhmed AAML 57 1 0 13 Jul 2023
Single-Class Target-Specific Attack against Interpretable Deep Learning Systems Eldor Abdukhamidov Mohammed Abuhamad George K. Thiruvathukal Hyoungshick Kim Tamer Abuhmed AAML 57 2 0 12 Jul 2023
Improving Prototypical Visual Explanations with Reward Reweighing, Reselection, and Retraining Aaron J. Li Robin Netzorg Zhihan Cheng Zhuoqin Zhang Bin Yu 73 3 0 08 Jul 2023
Feature Interactions Reveal Linguistic Structure in Language Models Jaap Jumelet Willem H. Zuidema FAtt 61 7 0 21 Jun 2023
A semantically enhanced dual encoder for aspect sentiment triplet extraction Baoxing Jiang Shehui Liang Peiyu Liu Kaifang Dong Hongye Li 75 16 0 14 Jun 2023
DEGREE: Decomposition Based Explanation For Graph Neural Networks Qizhang Feng Ninghao Liu Fan Yang Ruixiang Tang Mengnan Du Helen Zhou 99 25 0 22 May 2023
Learning with Explanation Constraints Rattana Pukdee Dylan Sam J. Zico Kolter Maria-Florina Balcan Pradeep Ravikumar FAtt 105 6 0 25 Mar 2023
Reveal to Revise: An Explainable AI Life Cycle for Iterative Bias Correction of Deep Models Frederik Pahde Maximilian Dreyer Wojciech Samek Sebastian Lapuschkin 58 17 0 22 Mar 2023
SHAP-IQ: Unified Approximation of any-order Shapley Interactions Fabian Fumagalli Maximilian Muschalik Patrick Kolpaczki Eyke Hüllermeier Barbara Hammer 131 30 0 02 Mar 2023
Does a Neural Network Really Encode Symbolic Concepts? Mingjie Li Quanshi Zhang 98 31 0 25 Feb 2023
Improving Interpretability via Explicit Word Interaction Graph Layer Arshdeep Sekhon Hanjie Chen A. Shrivastava Zhe Wang Yangfeng Ji Yanjun Qi AI4CE MILM 73 6 0 03 Feb 2023
Relational Local Explanations V. Borisov Gjergji Kasneci FAtt 72 0 0 23 Dec 2022
Explainability of Text Processing and Retrieval Methods: A Critical Survey Sourav Saha Debapriyo Majumdar Mandar Mitra 98 5 0 14 Dec 2022
Generating Hierarchical Explanations on Text Classification Without Connecting Rules Yiming Ju Yuanzhe Zhang Kang Liu Jun Zhao FAtt 46 3 0 24 Oct 2022
Self-explaining deep models with logic rule reasoning Seungeon Lee Xiting Wang Sungwon Han Xiaoyuan Yi Xing Xie M. Cha NAI ReLM LRM 96 17 0 13 Oct 2022
Feature Importance for Time Series Data: Improving KernelSHAP M. Villani J. Lockhart Daniele Magazzeni FAtt AI4TS 69 7 0 05 Oct 2022
Power of Explanations: Towards automatic debiasing in hate speech detection Yitao Cai Arthur Zimek Gerhard Wunder Eirini Ntoutsi 73 6 0 07 Sep 2022
From Attribution Maps to Human-Understandable Explanations through Concept Relevance Propagation Reduan Achtibat Maximilian Dreyer Ilona Eisenbraun S. Bosse Thomas Wiegand Wojciech Samek Sebastian Lapuschkin FAtt 87 150 0 07 Jun 2022
A Fine-grained Interpretability Evaluation Benchmark for Neural NLP Lijie Wang Yaozong Shen Shu-ping Peng Shuai Zhang Xinyan Xiao Hao Liu Hongxuan Tang Ying-Cong Chen Hua Wu Haifeng Wang ELM 104 22 0 23 May 2022
Implicit N-grams Induced by Recurrence Xiaobing Sun Wei Lu 51 3 0 05 May 2022
Interpretable Research Replication Prediction via Variational Contextual Consistency Sentence Masking Tianyi Luo Rui Meng Xinze Wang Yongxu Liu 59 4 0 28 Mar 2022
FaiRR: Faithful and Robust Deductive Reasoning over Natural Language Soumya Sanyal Harman Singh Xiang Ren ReLM LRM 106 46 0 19 Mar 2022
Beyond Explaining: Opportunities and Challenges of XAI-Based Model Improvement Leander Weber Sebastian Lapuschkin Alexander Binder Wojciech Samek 109 103 0 15 Mar 2022
Right for the Right Latent Factors: Debiasing Generative Models via Disentanglement Xiaoting Shao Karl Stelzner Kristian Kersting CML DRL 82 3 0 01 Feb 2022
Explainable Deep Learning in Healthcare: A Methodological Survey from an Attribution View Di Jin Elena Sergeeva W. Weng Geeticka Chauhan Peter Szolovits OOD 120 58 0 05 Dec 2021
Fast Axiomatic Attribution for Neural Networks Robin Hesse Simone Schaub-Meyer Stefan Roth 53 40 0 15 Nov 2021
Defining and Quantifying the Emergence of Sparse Concepts in DNNs Jie Ren Mingjie Li Qirui Chen Huiqi Deng Quanshi Zhang 135 33 0 11 Nov 2021
Machine Learning for Multimodal Electronic Health Records-based Research: Challenges and Perspectives Ziyi Liu Jiaqi Zhang Yongshuai Hou Xinran Zhang Ge Li Yang Xiang 101 14 0 09 Nov 2021
Interpreting Deep Learning Models in Natural Language Processing: A Review Xiaofei Sun Diyi Yang Xiaoya Li Tianwei Zhang Yuxian Meng Han Qiu Guoyin Wang Eduard H. Hovy Jiwei Li 99 47 0 20 Oct 2021
Logic Traps in Evaluating Attribution Scores Yiming Ju Yuanzhe Zhang Zhao Yang Zhongtao Jiang Kang Liu Jun Zhao XAI FAtt 119 19 0 12 Sep 2021
Beyond the Tip of the Iceberg: Assessing Coherence of Text Classifiers Shane Storks J. Chai 96 7 0 10 Sep 2021
Discretized Integrated Gradients for Explaining Language Models Soumya Sanyal Xiang Ren FAtt 68 54 0 31 Aug 2021
Neuron-level Interpretation of Deep NLP Models: A Survey Hassan Sajjad Nadir Durrani Fahim Dalvi MILM AI4CE 122 85 0 30 Aug 2021
Interpreting Attributions and Interactions of Adversarial Attacks Xin Eric Wang Shuyu Lin Hao Zhang Yufei Zhu Quanshi Zhang AAML FAtt 61 15 0 16 Aug 2021
Interpreting and improving deep-learning models with reality checks Chandan Singh Wooseok Ha Bin Yu FAtt 88 3 0 16 Aug 2021
Adaptive wavelet distillation from neural networks through interpretations Wooseok Ha Chandan Singh F. Lanusse Srigokul Upadhyayula Bin Yu 59 41 0 19 Jul 2021
Local Explanation of Dialogue Response Generation Yi-Lin Tuan Connor Pryor Wenhu Chen Lise Getoor Wenjie Wang 86 12 0 11 Jun 2021
Can We Faithfully Represent Masked States to Compute Shapley Values on a DNN? Jie Ren Zhanpeng Zhou Qirui Chen Quanshi Zhang FAtt TDI 84 8 0 22 May 2021
Attention vs non-attention for a Shapley-based explanation method T. Kersten Hugh Mee Wong Jaap Jumelet Dieuwke Hupkes 81 4 0 26 Apr 2021