arXiv: 2111.07624
Attention Mechanisms in Computer Vision: A Survey
15 November 2021
Meng-Hao Guo, Tianhan Xu, Jiangjiang Liu, Zheng-Ning Liu, Peng-Tao Jiang, Tai-Jiang Mu, Song-Hai Zhang, Ralph Robert Martin, Ming-Ming Cheng, Shimin Hu
Papers citing "Attention Mechanisms in Computer Vision: A Survey" (50 of 158 papers shown)
Engineering Artificial Intelligence: Framework, Challenges, and Future Direction
Jay Lee, Hanqi Su, Dai-Yan Ji, Takanobu Minami
AI4CE · 94 · 0 · 0 · 03 Apr 2025
YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection
Yuming Chen, Xinbin Yuan, Ruiqi Wu, Jiabao Wang, Qibin Hou, Ming-Ming Cheng
ObjD · 236 · 52 · 0 · 21 Feb 2025
Intelligent Anomaly Detection for Lane Rendering Using Transformer with Self-Supervised Pre-Training and Customized Fine-Tuning
Yongqi Dong, Xingmin Lu, Ruohan Li, Wei Song, B. Arem, Haneen Farah
ViT · 139 · 1 · 0 · 21 Feb 2025
FSTA-SNN: Frequency-based Spatial-Temporal Attention Module for Spiking Neural Networks
Kairong Yu, Tianqing Zhang, Hongwei Wang, Qi Xu
71 · 2 · 0 · 15 Dec 2024
Does Self-Attention Need Separate Weights in Transformers?
Md. Kowsher, Nusrat Jahan Prottasha, Chun-Nam Yu, O. Garibay, Niloofar Yousefi
432 · 0 · 0 · 30 Nov 2024

Rethinking the Role of Infrastructure in Collaborative Perception
Hyunchul Bae, Minhee Kang, Minwoo Song, Heejin Ahn
296 · 0 · 0 · 15 Oct 2024

RICAU-Net: Residual-block Inspired Coordinate Attention U-Net for Segmentation of Small and Sparse Calcium Lesions in Cardiac CT
Doyoung Park, Jinsoo Kim, Qi Chang, S. Leng, Liang Zhong, L. Baskaran
57 · 1 · 0 · 11 Sep 2024
Graph Representation Learning via Causal Diffusion for Out-of-Distribution Recommendation
Chu Zhao, Enneng Yang, Yuliang Liang, Pengxiang Lan, Yuting Liu, Jianzhe Zhao, Guibing Guo, Xingwei Wang
OOD · DiffM · CML · 154 · 7 · 0 · 01 Aug 2024

Deciphering the Definition of Adversarial Robustness for post-hoc OOD Detectors
Peter Lorenz, Mario Fernandez, Jens Müller, Ullrich Kothe
AAML · 102 · 1 · 0 · 21 Jun 2024

FOOL: Addressing the Downlink Bottleneck in Satellite Computing with Neural Feature Compression
Alireza Furutanpey, Qiyang Zhang, Philipp Raith, Tobias Pfandzelter, Shangguang Wang, Schahram Dustdar
113 · 4 · 0 · 25 Mar 2024
Cooperation Is All You Need
Ahsan Adeel, Junaid Muzaffar, K. Ahmed, Mohsin Raza, Eamin Chaudary, Talha Bin Riaz, Ahmed Saeed
85 · 5 · 0 · 16 May 2023
Visual Attention Network
Meng-Hao Guo, Chengrou Lu, Zheng-Ning Liu, Ming-Ming Cheng, Shiyong Hu
ViT · VLM · 51 · 655 · 0 · 20 Feb 2022

DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR
Shilong Liu, Feng Li, Hao Zhang, Xiaohu Yang, Xianbiao Qi, Hang Su, Jun Zhu, Lei Zhang
ViT · 249 · 740 · 0 · 28 Jan 2022

Masked Autoencoders Are Scalable Vision Learners
Kaiming He, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Dollár, Ross B. Girshick
ViT · TPM · 373 · 7,600 · 0 · 11 Nov 2021

Sampling Equivariant Self-attention Networks for Object Detection in Aerial Images
Guo-Ye Yang, Xiang-Li Li, Ralph Robert Martin, Shimin Hu
3DPC · 36 · 13 · 0 · 05 Nov 2021
Is Attention Better Than Matrix Decomposition?
Zhengyang Geng, Meng-Hao Guo, Hongxu Chen, Xia Li, Ke Wei, Zhouchen Lin
83 · 139 · 0 · 09 Sep 2021

FuseFormer: Fusing Fine-Grained Information in Transformers for Video Inpainting
R. Liu, Hanming Deng, Yangyi Huang, Xiaoyu Shi, Lewei Lu, Wenxiu Sun, Xiaogang Wang, Jifeng Dai, Hongsheng Li
ViT · 63 · 126 · 0 · 07 Sep 2021

Query2Label: A Simple Transformer Way to Multi-Label Classification
Shilong Liu, Lei Zhang, Xiao Yang, Hang Su, Jun Zhu
55 · 189 · 0 · 22 Jul 2021

VOLO: Vision Outlooker for Visual Recognition
Li-xin Yuan, Qibin Hou, Zihang Jiang, Jiashi Feng, Shuicheng Yan
ViT · 80 · 322 · 0 · 24 Jun 2021
BEiT: BERT Pre-Training of Image Transformers
Hangbo Bao, Li Dong, Songhao Piao, Furu Wei
ViT · 175 · 2,790 · 0 · 15 Jun 2021

CoAtNet: Marrying Convolution and Attention for All Data Sizes
Zihang Dai, Hanxiao Liu, Quoc V. Le, Mingxing Tan
ViT · 91 · 1,188 · 0 · 09 Jun 2021

When Vision Transformers Outperform ResNets without Pre-training or Strong Data Augmentations
Xiangning Chen, Cho-Jui Hsieh, Boqing Gong
ViT · 70 · 324 · 0 · 03 Jun 2021

SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers
Enze Xie, Wenhai Wang, Zhiding Yu, Anima Anandkumar, J. Álvarez, Ping Luo
ViT · 156 · 4,934 · 0 · 31 May 2021

Can Attention Enable MLPs To Catch Up With CNNs?
Meng-Hao Guo, Zheng-Ning Liu, Tai-Jiang Mu, Dun Liang, Ralph Robert Martin, Shimin Hu
AAML · 44 · 17 · 0 · 31 May 2021
ResMLP: Feedforward networks for image classification with data-efficient training
Hugo Touvron, Piotr Bojanowski, Mathilde Caron, Matthieu Cord, Alaaeldin El-Nouby, ..., Gautier Izacard, Armand Joulin, Gabriel Synnaeve, Jakob Verbeek, Hervé Jégou
VLM · 68 · 657 · 0 · 07 May 2021

Beyond Self-attention: External Attention using Two Linear Layers for Visual Tasks
Meng-Hao Guo, Zheng-Ning Liu, Tai-Jiang Mu, Shimin Hu
51 · 480 · 0 · 05 May 2021

MLP-Mixer: An all-MLP Architecture for Vision
Ilya O. Tolstikhin, N. Houlsby, Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, ..., Andreas Steiner, Daniel Keysers, Jakob Uszkoreit, Mario Lucic, Alexey Dosovitskiy
381 · 2,638 · 0 · 04 May 2021

Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron, Hugo Touvron, Ishan Misra, Hervé Jégou, Julien Mairal, Piotr Bojanowski, Armand Joulin
587 · 5,920 · 0 · 29 Apr 2021
Decoupled Spatial-Temporal Transformer for Video Inpainting
R. Liu, Hanming Deng, Yangyi Huang, Xiaoyu Shi, Lewei Lu, Wenxiu Sun, Xiaogang Wang, Jifeng Dai, Hongsheng Li
ViT · 55 · 46 · 0 · 14 Apr 2021

An Empirical Study of Training Self-Supervised Vision Transformers
Xinlei Chen, Saining Xie, Kaiming He
ViT · 129 · 1,837 · 0 · 05 Apr 2021

Going deeper with Image Transformers
Hugo Touvron, Matthieu Cord, Alexandre Sablayrolles, Gabriel Synnaeve, Hervé Jégou
ViT · 119 · 998 · 0 · 31 Mar 2021

CvT: Introducing Convolutions to Vision Transformers
Haiping Wu, Bin Xiao, Noel Codella, Mengchen Liu, Xiyang Dai, Lu Yuan, Lei Zhang
ViT · 114 · 1,891 · 0 · 29 Mar 2021
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, B. Guo
ViT · 327 · 21,175 · 0 · 25 Mar 2021

DeepViT: Towards Deeper Vision Transformer
Daquan Zhou, Bingyi Kang, Xiaojie Jin, Linjie Yang, Xiaochen Lian, Zihang Jiang, Qibin Hou, Jiashi Feng
ViT · 66 · 517 · 0 · 22 Mar 2021

Coordinate Attention for Efficient Mobile Network Design
Qibin Hou, Daquan Zhou, Jiashi Feng
57 · 2,992 · 0 · 04 Mar 2021

Transformer in Transformer
Kai Han, An Xiao, Enhua Wu, Jianyuan Guo, Chunjing Xu, Yunhe Wang
ViT · 362 · 1,544 · 0 · 27 Feb 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang, Enze Xie, Xiang Li, Deng-Ping Fan, Kaitao Song, Ding Liang, Tong Lu, Ping Luo, Ling Shao
ViT · 450 · 3,678 · 0 · 24 Feb 2021

Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet
Li-xin Yuan, Yunpeng Chen, Tao Wang, Weihao Yu, Yujun Shi, Zihang Jiang, Francis E. H. Tay, Jiashi Feng, Shuicheng Yan
ViT · 99 · 1,918 · 0 · 28 Jan 2021

Transformers in Vision: A Survey
Salman Khan, Muzammal Naseer, Munawar Hayat, Syed Waqas Zamir, Fahad Shahbaz Khan, M. Shah
ViT · 250 · 2,463 · 0 · 04 Jan 2021

Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers
Sixiao Zheng, Jiachen Lu, Hengshuang Zhao, Xiatian Zhu, Zekun Luo, ..., Yanwei Fu, Jianfeng Feng, Tao Xiang, Philip Torr, Li Zhang
ViT · 127 · 2,872 · 0 · 31 Dec 2020
FcaNet: Frequency Channel Attention Networks
Zequn Qin, Pengyi Zhang, Leilei Gan, Xi Li
65 · 689 · 0 · 22 Dec 2020

PCT: Point cloud transformer
Meng-Hao Guo, Junxiong Cai, Zheng-Ning Liu, Tai-Jiang Mu, Ralph Robert Martin, Shimin Hu
ViT · 3DPC · 114 · 1,599 · 0 · 17 Dec 2020

GTA: Global Temporal Attention for Video Action Understanding
Bo He, Xitong Yang, Zuxuan Wu, Hao Chen, Ser-Nam Lim, Abhinav Shrivastava
ViT · 58 · 27 · 0 · 15 Dec 2020

Pre-Trained Image Processing Transformer
Hanting Chen, Yunhe Wang, Tianyu Guo, Chang Xu, Yiping Deng, Zhenhua Liu, Siwei Ma, Chunjing Xu, Chao Xu, Wen Gao
VLM · ViT · 123 · 1,659 · 0 · 01 Dec 2020
Point Transformer
Nico Engel, Vasileios Belagiannis, Klaus C. J. Dietmayer
3DPC · 147 · 1,972 · 0 · 02 Nov 2020

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, ..., Matthias Minderer, G. Heigold, Sylvain Gelly, Jakob Uszkoreit, N. Houlsby
ViT · 400 · 40,217 · 0 · 22 Oct 2020

Deformable DETR: Deformable Transformers for End-to-End Object Detection
Xizhou Zhu, Weijie Su, Lewei Lu, Bin Li, Xiaogang Wang, Jifeng Dai
ViT · 164 · 4,993 · 0 · 08 Oct 2020

Rotate to Attend: Convolutional Triplet Attention Module
Diganta Misra, Trikay Nalamada, Ajay Uppili Arasanipalai, Qibin Hou
ViT · 3DPC · 44 · 586 · 0 · 06 Oct 2020
Sharpness-Aware Minimization for Efficiently Improving Generalization
Pierre Foret, Ariel Kleiner, H. Mobahi, Behnam Neyshabur
AAML · 164 · 1,323 · 0 · 03 Oct 2020

Rethinking Attention with Performers
K. Choromanski, Valerii Likhosherstov, David Dohan, Xingyou Song, Andreea Gane, ..., Afroz Mohiuddin, Lukasz Kaiser, David Belanger, Lucy J. Colwell, Adrian Weller
144 · 1,548 · 0 · 30 Sep 2020