
Axiomatic Attribution for Deep Networks

4 March 2017

Mukund Sundararajan, Ankur Taly, Qiqi Yan

Papers citing "Axiomatic Attribution for Deep Networks"

50 / 2,871 papers shown

From Concepts to Components: Concept-Agnostic Attention Module Discovery in Transformers Jingtong Su Julia Kempe Karen Ullrich 16 0 0 20 Jun 2025
Anomaly Detection in Event-triggered Traffic Time Series via Similarity Learning Shaoyu Dou Kai Yang Yang Jiao Chengbo Qiu Kui Ren AI4TS 31 1 0 20 Jun 2025
A Hybrid DeBERTa and Gated Broad Learning System for Cyberbullying Detection in English Text Devesh Kumar 17 0 0 19 Jun 2025
Pixel-level Certified Explanations via Randomized Smoothing Alaa Anani Tobias Lorenz Mario Fritz Bernt Schiele FAtt AAML 51 0 0 18 Jun 2025
TriGuard: Testing Model Safety with Attribution Entropy, Verification, and Drift Dipesh Tharu Mahato Rohan Poudel Pramod Dhungana AAML 30 0 0 17 Jun 2025
BMFM-RNA: An Open Framework for Building and Evaluating Transcriptomic Foundation Models Bharath Dandala Michael M. Danziger Ella Barkan Tanwi Biswas Viatcheslav Gurev ... Akira Koseki Tal Kozlovski Michal Rosen-Zvi Yishai Shimoni Ching-Huei Tsou AI4CE 23 0 0 17 Jun 2025
Attribution-guided Pruning for Compression, Circuit Discovery, and Targeted Correction in LLMs Sayed Mohammad Vakilzadeh Hatefi Maximilian Dreyer Reduan Achtibat Patrick Kahardipraja Thomas Wiegand Wojciech Samek Sebastian Lapuschkin 31 0 0 16 Jun 2025
Rethinking Explainability in the Era of Multimodal AI Chirag Agarwal 29 0 0 16 Jun 2025
Scientifically-Interpretable Reasoning Network (ScIReN): Uncovering the Black-Box of Nature Joshua Fan Haodi Xu Feng Tao Md Nasim Marc Grimson Yiqi Luo Carla P. Gomes 21 0 0 16 Jun 2025
Probing Deep into Temporal Profile Makes the Infrared Small Target Detector Much Better Ruojing Li Wei An Xinyi Ying Yingqian Wang Yimian Dai Longguang Wang Miao Li Y. Guo Li Liu 29 0 0 15 Jun 2025
Addressing Bias in LLMs: Strategies and Application to Fair AI-based Recruitment Alejandro Peña Julian Fierrez Aythami Morales Gonzalo Mancera Miguel Lopez Ruben Tolosana 22 0 0 13 Jun 2025
Why Do Class-Dependent Evaluation Effects Occur with Time Series Feature Attributions? A Synthetic Data Investigation Gregor Baer Isel Grau Chao Zhang Pieter Van Gorp 12 0 0 13 Jun 2025
Empirical Quantification of Spurious Correlations in Malware Detection Bianca Perasso Ludovico Lozza Andrea Ponte Luca Demetrio Luca Oneto Fabio Roli 89 0 0 11 Jun 2025
Know-MRI: A Knowledge Mechanisms Revealer&Interpreter for Large Language Models Jiaxiang Liu Boxuan Xing Chenhao Yuan Chenxiang Zhang Di Wu ... Haida Yu Chuhan Lang Pengfei Cao Jun Zhao Kang Liu 18 0 0 10 Jun 2025
Enhancing Accuracy and Maintainability in Nuclear Plant Data Retrieval: A Function-Calling LLM Approach Over NL-to-SQL Mishca de Costa Muhammad Anwar Dave Mercier Mark Randall Issam Hammad 26 0 0 10 Jun 2025
Did I Faithfully Say What I Thought? Bridging the Gap Between Neural Activity and Self-Explanations in Large Language Models Milan Bhan Jean-Noel Vittaut Nicolas Chesneau Sarath Chandar Marie-Jeanne Lesot LRM 31 0 0 10 Jun 2025
Explainable Compliance Detection with Multi-Hop Natural Language Inference on Assurance Case Structure Fariz Ikhwantri Dusica Marijan 25 0 0 10 Jun 2025
Towards Large Language Models with Self-Consistent Natural Language Explanations Sahar Admoni Ofra Amir Assaf Hallak Yftah Ziser LRM 21 0 0 09 Jun 2025
Evaluating explainable AI for deep learning-based network intrusion detection system alert classification Rajesh Kalakoti Risto Vaarandi Hayretdin Bahsi Sven Nõmm AAML 12 0 0 09 Jun 2025
Explaining Risks: Axiomatic Risk Attributions for Financial Models Dangxing Chen FAtt 20 0 0 07 Jun 2025
WISCA: A Consensus-Based Approach to Harmonizing Interpretability in Tabular Datasets A. Banegas-Luna Horacio Pérez-Sánchez Carlos Martínez-Cortés 15 0 0 06 Jun 2025
Interpretation Meets Safety: A Survey on Interpretation Methods and Tools for Improving LLM Safety Seongmin Lee Aeree Cho Grace C. Kim ShengYun Peng Mansi Phute Duen Horng Chau LM&MA AI4CE 82 0 0 05 Jun 2025
TIMING: Temporality-Aware Integrated Gradients for Time Series Explanation Hyeongwon Jang Changhun Kim Eunho Yang AI4TS 125 0 0 05 Jun 2025
Benchmarking Time-localized Explanations for Audio Classification Models Cecilia Bolaños L. Pepino Martin Meza Luciana Ferrer 39 0 0 04 Jun 2025
Explainable AI: XAI-Guided Context-Aware Data Augmentation Melkamu Mersha M. Yigezu A. Tonja Hassan Shakil Samer Iskander Olga Kolesnikova Jugal Kalita 188 0 0 04 Jun 2025
TracLLM: A Generic Framework for Attributing Long Context LLMs Yanting Wang Wei Zou Runpeng Geng Jinyuan Jia LLMAG 126 0 0 04 Jun 2025
Identifying Alzheimer's Disease Prediction Strategies of Convolutional Neural Network Classifiers using R2* Maps and Spectral Clustering C. Tinauer Maximilian Sackl Stefan Ropele C. Langkammer 56 0 0 04 Jun 2025
Bridging Neural ODE and ResNet: A Formal Error Bound for Safety Verification Abdelrahman Sayed Sayed Pierre-Jean Meyer Mohamed Ghazel 31 0 0 03 Jun 2025
XAI-Units: Benchmarking Explainability Methods with Unit Tests Jun Rui Lee Sadegh Emami Michael David Hollins Timothy C. H. Wong Carlos Ignacio Villalobos Sánchez Francesca Toni Dekai Zhang Adam Dejl 48 0 0 01 Jun 2025
Concept-Centric Token Interpretation for Vector-Quantized Generative Models Tianze Yang Yucheng Shi Mengnan Du Xuansheng Wu Qiaoyu Tan Jin Sun Ninghao Liu 26 0 0 31 May 2025
Understanding Refusal in Language Models with Sparse Autoencoders Wei Jie Yeo Nirmalendu Prakash Clement Neo Roy Ka-wei Lee Erik Cambria Ranjan Satapathy 16 0 0 29 May 2025
X2Graph for Cancer Subtyping Prediction on Biological Tabular Data Tu Bui Mohamed Suliman Aparajita Haldar Mohammed Amer Serban Georgescu GNN 44 0 0 29 May 2025
Self-Critique and Refinement for Faithful Natural Language Explanations Yingming Wang Pepa Atanasova LRM 126 0 0 28 May 2025
TensorShield: Safeguarding On-Device Inference by Shielding Critical DNN Tensors with TEE Tong Sun Bowen Jiang Hailong Lin Borui Li Yixiao Teng Yi Gao Wei Dong FedML 33 0 0 28 May 2025
Do you see what I see? An Ambiguous Optical Illusion Dataset exposing limitations of Explainable AI Carina Newen Luca Hinkamp Maria Ntonti Emmanuel Müller 120 0 0 27 May 2025
Relevance-driven Input Dropout: an Explanation-guided Regularization Technique Shreyas Gururaj Lars Grüne Wojciech Samek Sebastian Lapuschkin Leander Weber 142 0 0 27 May 2025
Situationally-Aware Dynamics Learning Alejandro Murillo-Gonzalez Lantao Liu 123 0 0 26 May 2025
A Comprehensive Survey on the Risks and Limitations of Concept-based Models Sanchit Sinha Aidong Zhang 17 0 0 25 May 2025
Learning to Explain: Prototype-Based Surrogate Models for LLM Classification Bowen Wei Mehrdad Fazli Ziwei Zhu LLMAG 129 0 0 25 May 2025
Anchored Diffusion Language Model Litu Rout Constantine Caramanis Sanjay Shakkottai 72 0 0 24 May 2025
ProxySPEX: Inference-Efficient Interpretability via Sparse Feature Interactions in LLMs Landon Butler Abhineet Agarwal Justin Singh Kang Yigit Efe Erginbas Bin Yu Kannan Ramchandran 143 0 0 23 May 2025
Reverse-Speech-Finder: A Neural Network Backtracking Architecture for Generating Alzheimer's Disease Speech Samples and Improving Diagnosis Performance Victor O.K. Li Yang Han Jacqueline C. K. Lam Lawrence Y. L. Cheung 198 0 0 23 May 2025
GIM: Improved Interpretability for Large Language Models Joakim Edin Róbert Csordás Tuukka Ruotsalo Zhengxuan Wu Maria Maistro Jing-ling Huang Lars Maaløe 124 0 0 23 May 2025
Soft-CAM: Making black box models self-explainable for high-stakes decisions K. Djoumessi Philipp Berens FAtt BDL 233 0 0 23 May 2025
The Atlas of In-Context Learning: How Attention Heads Shape In-Context Retrieval Augmentation Patrick Kahardipraja Reduan Achtibat Thomas Wiegand Wojciech Samek Sebastian Lapuschkin 151 0 0 21 May 2025
Do Language Models Use Their Depth Efficiently? Róbert Csordás Christopher D. Manning Christopher Potts 210 2 0 20 May 2025
"Haet Bhasha aur Diskrimineshun": Phonetic Perturbations in Code-Mixed Hinglish to Red-Team LLMs Darpan Aswal Siddharth D Jaiswal AAML 57 0 0 20 May 2025
Explainable Prediction of the Mechanical Properties of Composites with CNNs Varun Raaghav Dimitrios Bikos Antonio Rago Francesca Toni Maria Charalambides 117 0 0 20 May 2025
Attributional Safety Failures in Large Language Models under Code-Mixed Perturbations Somnath Banerjee Pratyush Chatterjee Shanu Kumar Sayan Layek Parag Agrawal Rima Hazra Animesh Mukherjee AAML 200 0 0 20 May 2025
EPIC: Explanation of Pretrained Image Classification Networks via Prototype Piotr Borycki Magdalena Trędowicz Szymon Janusz Jacek Tabor Przemysław Spurek Arkadiusz Lewicki Łukasz Struski 196 0 0 19 May 2025