Self-Attention Attribution: Interpreting Information Interactions Inside Transformer
Y. Hao, Li Dong, Furu Wei, Ke Xu
arXiv 2004.11207 | 23 April 2020 [ViT]

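As context for the citation list below: the paper attributes a model prediction to individual attention heads by integrating the output gradient along a straight-line path from a zeroed-out attention matrix to the observed one, then weighting the result elementwise by the attention scores. A minimal PyTorch sketch of that integral, approximated with a Riemann sum; model_forward, the tensor shapes, and the step count are illustrative assumptions, not the authors' released code:

import torch

def attention_attribution(model_forward, attention, steps=20):
    # Riemann-sum approximation of Attr(A) = A * \int_0^1 dF(aA)/dA da
    # (elementwise product), following the attribution definition in the paper.
    # attention: (num_heads, seq_len, seq_len) scores from one layer (assumed shape).
    # model_forward: hypothetical callable that re-runs the model with this layer's
    # attention overridden and returns a scalar, e.g. the gold-label logit.
    total_grad = torch.zeros_like(attention)
    for k in range(1, steps + 1):
        scaled = (k / steps * attention).detach().requires_grad_(True)
        out = model_forward(scaled)  # scalar F(alpha * A)
        total_grad += torch.autograd.grad(out, scaled)[0]
    return attention * total_grad / steps

In the paper, a head's importance is then summarized from its attribution map (its maximum attribution value), which is what ranks heads for pruning and analysis.
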
Papers citing "Self-Attention Attribution: Interpreting Information Interactions Inside Transformer" (34 of 34 shown):

ForeCite: Adapting Pre-Trained Language Models to Predict Future Citation Rates of Academic Papers
Gavin Hull, Alex Bihlo
13 May 2025

Hierarchical Attention Network for Interpretable ECG-based Heart Disease Classification
Mario Padilla Rodriguez, Mohamed Nafea
25 Mar 2025

Revealing and Mitigating Over-Attention in Knowledge Editing
Pinzheng Wang, Zecheng Tang, Keyan Zhou, J. Li, Qiaoming Zhu, Hao Fei
21 Feb 2025 [KELM]

Attention Mechanisms Don't Learn Additive Models: Rethinking Feature Importance for Transformers
Tobias Leemann, Alina Fastowski, Felix Pfeiffer, Gjergji Kasneci
10 Jan 2025

Dynamic Attention-Guided Context Decoding for Mitigating Context Faithfulness Hallucinations in Large Language Models
Yanwen Huang, Yong Zhang, Ning Cheng, Zhitao Li, Shaojun Wang, Jing Xiao
02 Jan 2025

Towards Understanding the Fragility of Multilingual LLMs against Fine-Tuning Attacks
Samuele Poppi, Zheng-Xin Yong, Yifei He, Bobbie Chern, Han Zhao, Aobo Yang, Jianfeng Chi
23 Oct 2024 [AAML]

Repurposing Foundation Model for Generalizable Medical Time Series Classification
Nan Huang, Haishuai Wang, Zihuai He, Marinka Zitnik, Xiang Zhang
03 Oct 2024 [AI4TS, OOD]

SRViT: Vision Transformers for Estimating Radar Reflectivity from Satellite Observations at Scale
Jason Stock, Kyle Hilburn, Imme Ebert-Uphoff, Charles Anderson
20 Jun 2024

Are Human Conversations Special? A Large Language Model Perspective
Toshish Jawale, Chaitanya Animesh, Sekhar Vallath, Kartik Talamadupula, Larry Heck
08 Mar 2024

Interpreting and Exploiting Functional Specialization in Multi-Head Attention under Multi-task Learning
Chong Li, Shaonan Wang, Yunhao Zhang, Jiajun Zhang, Chengqing Zong
16 Oct 2023

Concise and Organized Perception Facilitates Reasoning in Large Language Models
Junjie Liu, Shaotian Yan, Chen Shen, Zhengdong Xiao, Wenxiao Wang, Jieping Ye
05 Oct 2023 [LRM]

Interpretability-Aware Vision Transformer
Yao Qiang, Chengyin Li, Prashant Khanduri, D. Zhu
14 Sep 2023 [ViT]

Instruction Position Matters in Sequence Generation with Large Language Models
Yanjun Liu, Xianfeng Zeng, Fandong Meng, Jie Zhou
23 Aug 2023 [LRM]

Causal Intersectionality and Dual Form of Gradient Descent for Multimodal Analysis: a Case Study on Hateful Memes
Yosuke Miyanishi, Minh Le Nguyen
19 Aug 2023

PMET: Precise Model Editing in a Transformer
Xiaopeng Li, Shasha Li, Shezheng Song, Jing Yang, Jun Ma, Jie Yu
17 Aug 2023 [KELM]

LEA: Improving Sentence Similarity Robustness to Typos Using Lexical Attention Bias
Mario Almagro, Emilio Almazán, Diego Ortego, David Jiménez
06 Jul 2023

Dynamic Context Pruning for Efficient and Interpretable Autoregressive Transformers
Sotiris Anagnostidis, Dario Pavllo, Luca Biggio, Lorenzo Noci, Aurelien Lucchi, Thomas Hofmann
25 May 2023

How Does Attention Work in Vision Transformers? A Visual Analytics Attempt
Yiran Li, Junpeng Wang, Xin Dai, Liang Wang, Chin-Chia Michael Yeh, Yan Zheng, Wei Zhang, Kwan-Liu Ma
24 Mar 2023 [ViT]

Interpretability in Activation Space Analysis of Transformers: A Focused Survey
Soniya Vijayakumar
22 Jan 2023 [AI4CE]

Spatio-Temporal Attention in Multi-Granular Brain Chronnectomes for Detection of Autism Spectrum Disorder
James Orme-Rogers, Ajitesh Srivastava
30 Oct 2022

Explainable Slot Type Attentions to Improve Joint Intent Detection and Slot Filling
Kalpa Gunaratna, Vijay Srinivasan, Akhila Yerukola, Hongxia Jin
19 Oct 2022

AD-DROP: Attribution-Driven Dropout for Robust Language Model Fine-Tuning
Tao Yang, Jinghao Deng, Xiaojun Quan, Qifan Wang, Shaoliang Nie
12 Oct 2022

TransPolymer: a Transformer-based language model for polymer property predictions
Changwen Xu, Yuyang Wang, A. Farimani
03 Sep 2022

FedMCSA: Personalized Federated Learning via Model Components Self-Attention
Qianling Guo, Yong Qi, Saiyu Qi, Di Wu, Qian Li
23 Aug 2022 [FedML]

What does Transformer learn about source code?
Kechi Zhang, Ge Li, Zhi Jin
18 Jul 2022 [ViT]

EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm
Jiangning Zhang, Xiangtai Li, Yabiao Wang, Chengjie Wang, Yibo Yang, Yong Liu, Dacheng Tao
19 Jun 2022 [ViT]

Kformer: Knowledge Injection in Transformer Feed-Forward Layers
Yunzhi Yao, Shaohan Huang, Li Dong, Furu Wei, Huajun Chen, Ningyu Zhang
15 Jan 2022 [KELM, MedIm]

Identifying and Mitigating Spurious Correlations for Improving Robustness in NLP Models
Tianlu Wang, Rohit Sridhar, Diyi Yang, Xuezhi Wang
14 Oct 2021 [AAML]

Adversarial Robustness Comparison of Vision Transformer and MLP-Mixer to CNNs
Philipp Benz, Soomin Ham, Chaoning Zhang, Adil Karjauv, In So Kweon
06 Oct 2021 [AAML, ViT]

Attributing Fair Decisions with Attention Interventions
Ninareh Mehrabi, Umang Gupta, Fred Morstatter, Greg Ver Steeg, Aram Galstyan
08 Sep 2021

BEiT: BERT Pre-Training of Image Transformers
Hangbo Bao, Li Dong, Songhao Piao, Furu Wei
15 Jun 2021 [ViT]

Knowledge Neurons in Pretrained Transformers
Damai Dai, Li Dong, Y. Hao, Zhifang Sui, Baobao Chang, Furu Wei
18 Apr 2021 [KELM, MU]

On the Robustness of Vision Transformers to Adversarial Examples
Kaleel Mahmood, Rigel Mahmood, Marten van Dijk
31 Mar 2021 [ViT]

GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman
20 Apr 2018 [ELM]