1703.01365
Cited By
v1
v2 (latest)
Axiomatic Attribution for Deep Networks
4 March 2017
Mukund Sundararajan
Ankur Taly
Qiqi Yan
OOD
FAtt
Papers citing "Axiomatic Attribution for Deep Networks"
50 / 2,871 papers shown
Title
From Concepts to Components: Concept-Agnostic Attention Module Discovery in Transformers
Jingtong Su
Julia Kempe
Karen Ullrich
16
0
0
20 Jun 2025
Anomaly Detection in Event-triggered Traffic Time Series via Similarity Learning
Shaoyu Dou
Kai Yang
Yang Jiao
Chengbo Qiu
Kui Ren
AI4TS
31
1
0
20 Jun 2025
A Hybrid DeBERTa and Gated Broad Learning System for Cyberbullying Detection in English Text
Devesh Kumar
17
0
0
19 Jun 2025
Pixel-level Certified Explanations via Randomized Smoothing
Alaa Anani
Tobias Lorenz
Mario Fritz
Bernt Schiele
FAtt
AAML
51
0
0
18 Jun 2025
TriGuard: Testing Model Safety with Attribution Entropy, Verification, and Drift
Dipesh Tharu Mahato
Rohan Poudel
Pramod Dhungana
AAML
30
0
0
17 Jun 2025
BMFM-RNA: An Open Framework for Building and Evaluating Transcriptomic Foundation Models
Bharath Dandala
Michael M. Danziger
Ella Barkan
Tanwi Biswas
Viatcheslav Gurev
...
Akira Koseki
Tal Kozlovski
Michal Rosen-Zvi
Yishai Shimoni
Ching-Huei Tsou
AI4CE
23
0
0
17 Jun 2025
Attribution-guided Pruning for Compression, Circuit Discovery, and Targeted Correction in LLMs
Sayed Mohammad Vakilzadeh Hatefi
Maximilian Dreyer
Reduan Achtibat
Patrick Kahardipraja
Thomas Wiegand
Wojciech Samek
Sebastian Lapuschkin
31
0
0
16 Jun 2025
Rethinking Explainability in the Era of Multimodal AI
Chirag Agarwal
29
0
0
16 Jun 2025
Scientifically-Interpretable Reasoning Network (ScIReN): Uncovering the Black-Box of Nature
Joshua Fan
Haodi Xu
Feng Tao
Md Nasim
Marc Grimson
Yiqi Luo
Carla P. Gomes
21
0
0
16 Jun 2025
Probing Deep into Temporal Profile Makes the Infrared Small Target Detector Much Better
Ruojing Li
Wei An
Xinyi Ying
Yingqian Wang
Yimian Dai
Longguang Wang
Miao Li
Y. Guo
Li Liu
29
0
0
15 Jun 2025
Addressing Bias in LLMs: Strategies and Application to Fair AI-based Recruitment
Alejandro Peña
Julian Fierrez
Aythami Morales
Gonzalo Mancera
Miguel Lopez
Ruben Tolosana
22
0
0
13 Jun 2025
Why Do Class-Dependent Evaluation Effects Occur with Time Series Feature Attributions? A Synthetic Data Investigation
Gregor Baer
Isel Grau
Chao Zhang
Pieter Van Gorp
12
0
0
13 Jun 2025
Empirical Quantification of Spurious Correlations in Malware Detection
Bianca Perasso
Ludovico Lozza
Andrea Ponte
Luca Demetrio
Luca Oneto
Fabio Roli
89
0
0
11 Jun 2025
Know-MRI: A Knowledge Mechanisms Revealer&Interpreter for Large Language Models
Jiaxiang Liu
Boxuan Xing
Chenhao Yuan
Chenxiang Zhang
Di Wu
...
Haida Yu
Chuhan Lang
Pengfei Cao
Jun Zhao
Kang Liu
18
0
0
10 Jun 2025
Enhancing Accuracy and Maintainability in Nuclear Plant Data Retrieval: A Function-Calling LLM Approach Over NL-to-SQL
Mishca de Costa
Muhammad Anwar
Dave Mercier
Mark Randall
Issam Hammad
26
0
0
10 Jun 2025
Did I Faithfully Say What I Thought? Bridging the Gap Between Neural Activity and Self-Explanations in Large Language Models
Milan Bhan
Jean-Noel Vittaut
Nicolas Chesneau
Sarath Chandar
Marie-Jeanne Lesot
LRM
31
0
0
10 Jun 2025
Explainable Compliance Detection with Multi-Hop Natural Language Inference on Assurance Case Structure
Fariz Ikhwantri
Dusica Marijan
25
0
0
10 Jun 2025
Towards Large Language Models with Self-Consistent Natural Language Explanations
Sahar Admoni
Ofra Amir
Assaf Hallak
Yftah Ziser
LRM
21
0
0
09 Jun 2025
Evaluating explainable AI for deep learning-based network intrusion detection system alert classification
Rajesh Kalakoti
Risto Vaarandi
Hayretdin Bahsi
Sven Nõmm
AAML
12
0
0
09 Jun 2025
Explaining Risks: Axiomatic Risk Attributions for Financial Models
Dangxing Chen
FAtt
20
0
0
07 Jun 2025
WISCA: A Consensus-Based Approach to Harmonizing Interpretability in Tabular Datasets
A. Banegas-Luna
Horacio Pérez-Sánchez
Carlos Martínez-Cortés
15
0
0
06 Jun 2025
Interpretation Meets Safety: A Survey on Interpretation Methods and Tools for Improving LLM Safety
Seongmin Lee
Aeree Cho
Grace C. Kim
ShengYun Peng
Mansi Phute
Duen Horng Chau
LM&MA
AI4CE
82
0
0
05 Jun 2025
TIMING: Temporality-Aware Integrated Gradients for Time Series Explanation
Hyeongwon Jang
Changhun Kim
Eunho Yang
AI4TS
125
0
0
05 Jun 2025
Benchmarking Time-localized Explanations for Audio Classification Models
Cecilia Bolaños
L. Pepino
Martin Meza
Luciana Ferrer
39
0
0
04 Jun 2025
Explainable AI: XAI-Guided Context-Aware Data Augmentation
Melkamu Mersha
M. Yigezu
A. Tonja
Hassan Shakil
Samer Iskander
Olga Kolesnikova
Jugal Kalita
188
0
0
04 Jun 2025
TracLLM: A Generic Framework for Attributing Long Context LLMs
Yanting Wang
Wei Zou
Runpeng Geng
Jinyuan Jia
LLMAG
126
0
0
04 Jun 2025
Identifying Alzheimer's Disease Prediction Strategies of Convolutional Neural Network Classifiers using R2* Maps and Spectral Clustering
C. Tinauer
Maximilian Sackl
Stefan Ropele
C. Langkammer
56
0
0
04 Jun 2025
Bridging Neural ODE and ResNet: A Formal Error Bound for Safety Verification
Abdelrahman Sayed Sayed
Pierre-Jean Meyer
Mohamed Ghazel
31
0
0
03 Jun 2025
XAI-Units: Benchmarking Explainability Methods with Unit Tests
Jun Rui Lee
Sadegh Emami
Michael David Hollins
Timothy C. H. Wong
Carlos Ignacio Villalobos Sánchez
Francesca Toni
Dekai Zhang
Adam Dejl
48
0
0
01 Jun 2025
Concept-Centric Token Interpretation for Vector-Quantized Generative Models
Tianze Yang
Yucheng Shi
Mengnan Du
Xuansheng Wu
Qiaoyu Tan
Jin Sun
Ninghao Liu
26
0
0
31 May 2025
Understanding Refusal in Language Models with Sparse Autoencoders
Wei Jie Yeo
Nirmalendu Prakash
Clement Neo
Roy Ka-wei Lee
Erik Cambria
Ranjan Satapathy
16
0
0
29 May 2025
X2Graph for Cancer Subtyping Prediction on Biological Tabular Data
Tu Bui
Mohamed Suliman
Aparajita Haldar
Mohammed Amer
Serban Georgescu
GNN
44
0
0
29 May 2025
Self-Critique and Refinement for Faithful Natural Language Explanations
Yingming Wang
Pepa Atanasova
LRM
126
0
0
28 May 2025
TensorShield: Safeguarding On-Device Inference by Shielding Critical DNN Tensors with TEE
Tong Sun
Bowen Jiang
Hailong Lin
Borui Li
Yixiao Teng
Yi Gao
Wei Dong
FedML
33
0
0
28 May 2025
Do you see what I see? An Ambiguous Optical Illusion Dataset exposing limitations of Explainable AI
Carina Newen
Luca Hinkamp
Maria Ntonti
Emmanuel Müller
120
0
0
27 May 2025
Relevance-driven Input Dropout: an Explanation-guided Regularization Technique
Shreyas Gururaj
Lars Grüne
Wojciech Samek
Sebastian Lapuschkin
Leander Weber
142
0
0
27 May 2025
Situationally-Aware Dynamics Learning
Alejandro Murillo-Gonzalez
Lantao Liu
123
0
0
26 May 2025
A Comprehensive Survey on the Risks and Limitations of Concept-based Models
Sanchit Sinha
Aidong Zhang
17
0
0
25 May 2025
Learning to Explain: Prototype-Based Surrogate Models for LLM Classification
Bowen Wei
Mehrdad Fazli
Ziwei Zhu
LLMAG
129
0
0
25 May 2025
Anchored Diffusion Language Model
Litu Rout
Constantine Caramanis
Sanjay Shakkottai
72
0
0
24 May 2025
ProxySPEX: Inference-Efficient Interpretability via Sparse Feature Interactions in LLMs
Landon Butler
Abhineet Agarwal
Justin Singh Kang
Yigit Efe Erginbas
Bin Yu
Kannan Ramchandran
143
0
0
23 May 2025
Reverse-Speech-Finder: A Neural Network Backtracking Architecture for Generating Alzheimer's Disease Speech Samples and Improving Diagnosis Performance
Victor O.K. Li
Yang Han
Jacqueline C. K. Lam
Lawrence Y. L. Cheung
198
0
0
23 May 2025
GIM: Improved Interpretability for Large Language Models
Joakim Edin
Róbert Csordás
Tuukka Ruotsalo
Zhengxuan Wu
Maria Maistro
Jing-ling Huang
Lars Maaløe
124
0
0
23 May 2025
Soft-CAM: Making black box models self-explainable for high-stakes decisions
K. Djoumessi
Philipp Berens
FAtt
BDL
233
0
0
23 May 2025
The Atlas of In-Context Learning: How Attention Heads Shape In-Context Retrieval Augmentation
Patrick Kahardipraja
Reduan Achtibat
Thomas Wiegand
Wojciech Samek
Sebastian Lapuschkin
151
0
0
21 May 2025
Do Language Models Use Their Depth Efficiently?
Róbert Csordás
Christopher D. Manning
Christopher Potts
210
2
0
20 May 2025
"Haet Bhasha aur Diskrimineshun": Phonetic Perturbations in Code-Mixed Hinglish to Red-Team LLMs
Darpan Aswal
Siddharth D Jaiswal
AAML
57
0
0
20 May 2025
Explainable Prediction of the Mechanical Properties of Composites with CNNs
Varun Raaghav
Dimitrios Bikos
Antonio Rago
Francesca Toni
Maria Charalambides
117
0
0
20 May 2025
Attributional Safety Failures in Large Language Models under Code-Mixed Perturbations
Somnath Banerjee
Pratyush Chatterjee
Shanu Kumar
Sayan Layek
Parag Agrawal
Rima Hazra
Animesh Mukherjee
AAML
200
0
0
20 May 2025
EPIC: Explanation of Pretrained Image Classification Networks via Prototype
Piotr Borycki
Magdalena Trędowicz
Szymon Janusz
Jacek Tabor
Przemysław Spurek
Arkadiusz Lewicki
Łukasz Struski
196
0
0
19 May 2025