Quantifying Attention Flow in Transformers

Samira Abnar, Willem H. Zuidema
2 May 2020 · arXiv:2005.00928

Papers citing "Quantifying Attention Flow in Transformers"

Showing 50 of 403 citing papers.
Incorporating Residual and Normalization Layers into Analysis of Masked Language Models
Goro Kobayashi, Tatsuki Kuribayashi, Sho Yokoi, Kentaro Inui · 15 Sep 2021

Rationales for Sequential Predictions
Keyon Vafa, Yuntian Deng, David M. Blei, Alexander M. Rush · 14 Sep 2021

The Grammar-Learning Trajectories of Neural Language Models
Leshem Choshen, Guy Hacohen, D. Weinshall, Omri Abend · 13 Sep 2021

When is Wall a Pared and when a Muro? -- Extracting Rules Governing Lexical Selection
Aditi Chaudhary, Kayo Yin, Antonios Anastasopoulos, Graham Neubig · AAML · 13 Sep 2021

Global-Local Transformer for Brain Age Estimation
Sheng He, P. E. Grant, Yangming Ou · ViT, MedIm · 03 Sep 2021

Shifted Chunk Transformer for Spatio-Temporal Representational Learning
Xuefan Zha, Wentao Zhu, Tingxun Lv, Sen Yang, Ji Liu · AI4TS, ViT · 26 Aug 2021

MM-ViT: Multi-Modal Video Transformer for Compressed Video Action Recognition
Jiawei Chen, C. Ho · ViT · 20 Aug 2021

Causal Attention for Unbiased Visual Recognition
Tan Wang, Chan Zhou, Qianru Sun, Hanwang Zhang · OOD, CML · 19 Aug 2021

Post-hoc Interpretability for Neural NLP: A Survey
Andreas Madsen, Siva Reddy, A. Chandar · XAI · 10 Aug 2021
Evo-ViT: Slow-Fast Token Evolution for Dynamic Vision Transformer
Yifan Xu, Zhijie Zhang, Mengdan Zhang, Kekai Sheng, Ke Li, Weiming Dong, Liqing Zhang, Changsheng Xu, Xing Sun · ViT · 03 Aug 2021

Transformer-based deep imitation learning for dual-arm robot manipulation
Heecheol Kim, Y. Ohmura, Y. Kuniyoshi · 01 Aug 2021

Revisiting Negation in Neural Machine Translation
Gongbo Tang, Philipp Ronchen, Rico Sennrich, Joakim Nivre · 26 Jul 2021

Transformer with Peak Suppression and Knowledge Guidance for Fine-grained Image Recognition
Xinda Liu, Lili Wang, Xiaoguang Han · ViT · 14 Jul 2021

Attention Bottlenecks for Multimodal Fusion
Arsha Nagrani, Shan Yang, Anurag Arnab, A. Jansen, Cordelia Schmid, Chen Sun · 30 Jun 2021

ResViT: Residual vision transformers for multi-modal medical image synthesis
Onat Dalmaz, Mahmut Yurt, Tolga Çukur · ViT, MedIm · 30 Jun 2021

VEGN: Variant Effect Prediction with Graph Neural Networks
Jun Cheng, Carolin (Haas) Lawrence, Mathias Niepert · 25 Jun 2021

Probing Inter-modality: Visual Parsing with Self-Attention for Vision-Language Pre-training
Hongwei Xue, Yupan Huang, Bei Liu, Houwen Peng, Jianlong Fu, Houqiang Li, Jiebo Luo · 25 Jun 2021

Exploring Vision Transformers for Fine-grained Classification
Marcos V. Conde, Kerem Turgutlu · ViT · 19 Jun 2021

Relative Importance in Sentence Processing
Nora Hollenstein, Lisa Beinborn · FAtt · 07 Jun 2021
Anticipative Video Transformer
Rohit Girdhar, Kristen Grauman · ViT · 03 Jun 2021

Attention Flows are Shapley Value Explanations
Kawin Ethayarajh, Dan Jurafsky · FAtt, TDI · 31 May 2021

Nested Hierarchical Transformer: Towards Accurate, Data-Efficient and Interpretable Visual Understanding
Zizhao Zhang, Han Zhang, Long Zhao, Ting Chen, Sercan Ö. Arik, Tomas Pfister · ViT · 26 May 2021

Vision Transformers are Robust Learners
Sayak Paul, Pin-Yu Chen · ViT · 17 May 2021

Episodic Transformer for Vision-and-Language Navigation
Alexander Pashevich, Cordelia Schmid, Chen Sun · LM&Ro · 13 May 2021

Conformer: Local Features Coupling Global Representations for Visual Recognition
Zhiliang Peng, Wei Huang, Shanzhi Gu, Lingxi Xie, Yaowei Wang, Jianbin Jiao, QiXiang Ye · ViT · 09 May 2021

Inpainting Transformer for Anomaly Detection
Jonathan Pirnay, K. Chai · ViT · 28 Apr 2021

VidTr: Video Transformer Without Convolutions
Yanyi Zhang, Xinyu Li, Chunhui Liu, Bing Shuai, Yi Zhu, Biagio Brattoli, Hao Chen, I. Marsic, Joseph Tighe · ViT · 23 Apr 2021

Towards Human-Understandable Visual Explanations: Imperceptible High-frequency Cues Can Better Be Removed
Kaili Wang, José Oramas, Tinne Tuytelaars · AAML · 16 Apr 2021
Towards BERT-based Automatic ICD Coding: Limitations and Opportunities
Damian Pascual, Sandro Luck, Roger Wattenhofer · MedIm · 14 Apr 2021

Exploring the Role of BERT Token Representations to Explain Sentence Probing Results
Hosein Mohebbi, Ali Modarressi, Mohammad Taher Pilehvar · MILM · 03 Apr 2021

TubeR: Tubelet Transformer for Video Action Detection
Jiaojiao Zhao, Yanyi Zhang, Xinyu Li, Hao Chen, Shuai Bing, ..., Yuanjun Xiong, Davide Modolo, I. Marsic, Cees G. M. Snoek, Joseph Tighe · ViT · 02 Apr 2021

On the Robustness of Vision Transformers to Adversarial Examples
Kaleel Mahmood, Rigel Mahmood, Marten van Dijk · ViT · 31 Mar 2021

Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers
Hila Chefer, Shir Gur, Lior Wolf · ViT · 29 Mar 2021

Face Transformer for Recognition
Yaoyao Zhong, Weihong Deng · ViT · 27 Mar 2021

Interpretable Deep Learning: Interpretation, Interpretability, Trustworthiness, and Beyond
Xuhong Li, Haoyi Xiong, Xingjian Li, Xuanyu Wu, Xiao Zhang, Ji Liu, Jiang Bian, Dejing Dou · AAML, FaML, XAI, HAI · 19 Mar 2021

TransFG: A Transformer Architecture for Fine-grained Recognition
Ju He, Jieneng Chen, Shuai Liu, Adam Kortylewski, Cheng Yang, Yutong Bai, Changhu Wang · ViT · 14 Mar 2021

Towards Generalizable and Robust Face Manipulation Detection via Bag-of-local-feature
Changtao Miao, Qi Chu, Weihai Li, Tao Gong, Wanyi Zhuang, Nenghai Yu · AAML, CVBM, ViT · 14 Mar 2021
OmniNet: Omnidirectional Representations from Transformers
Yi Tay, Mostafa Dehghani, V. Aribandi, Jai Gupta, Philip Pham, Zhen Qin, Dara Bahri, Da-Cheng Juan, Donald Metzler · 01 Mar 2021

Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius, Heng Wang, Lorenzo Torresani · ViT · 09 Feb 2021

Transformers in Vision: A Survey
Salman Khan, Muzammal Naseer, Munawar Hayat, Syed Waqas Zamir, Fahad Shahbaz Khan, M. Shah · ViT · 04 Jan 2021

On Explaining Your Explanations of BERT: An Empirical Study with Sequence Classification
Zhengxuan Wu, Desmond C. Ong · 01 Jan 2021

Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers
Sixiao Zheng, Jiachen Lu, Hengshuang Zhao, Xiatian Zhu, Zekun Luo, ..., Yanwei Fu, Jianfeng Feng, Tao Xiang, Philip Torr, Li Zhang · ViT · 31 Dec 2020

Transformer Interpretability Beyond Attention Visualization
Hila Chefer, Shir Gur, Lior Wolf · 17 Dec 2020

Cross-Modal Retrieval and Synthesis (X-MRS): Closing the Modality Gap in Shared Representation Learning
Ricardo Guerrero, Hai Xuan Pham, Vladimir Pavlovic · 02 Dec 2020

DoLFIn: Distributions over Latent Features for Interpretability
Phong Le, Willem H. Zuidema · FAtt · 10 Nov 2020

Long Range Arena: A Benchmark for Efficient Transformers
Yi Tay, Mostafa Dehghani, Samira Abnar, Songlin Yang, Dara Bahri, Philip Pham, J. Rao, Liu Yang, Sebastian Ruder, Donald Metzler · 08 Nov 2020
Influence Patterns for Explaining Information Flow in BERT
Kaiji Lu, Zifan Wang, Piotr (Peter) Mardziel, Anupam Datta · GNN · 02 Nov 2020

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, ..., Matthias Minderer, G. Heigold, Sylvain Gelly, Jakob Uszkoreit, N. Houlsby · ViT · 22 Oct 2020

The elephant in the interpretability room: Why use attention as explanation when we have saliency methods?
Jasmijn Bastings, Katja Filippova · XAI, LRM · 12 Oct 2020

Structured Self-Attention Weights Encode Semantics in Sentiment Analysis
Zhengxuan Wu, Thanh-Son Nguyen, Desmond C. Ong · MILM · 10 Oct 2020