Attention Mechanisms in Computer Vision: A Survey

15 November 2021
Meng-Hao Guo, Tianhan Xu, Jiangjiang Liu, Zheng-Ning Liu, Peng-Tao Jiang, Tai-Jiang Mu, Song-Hai Zhang, Ralph Robert Martin, Ming-Ming Cheng, Shimin Hu

Papers citing "Attention Mechanisms in Computer Vision: A Survey"

Showing 50 of 158 citing papers:
Engineering Artificial Intelligence: Framework, Challenges, and Future Direction (03 Apr 2025)
  Jay Lee, Hanqi Su, Dai-Yan Ji, Takanobu Minami [AI4CE]
YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection (21 Feb 2025)
  Yuming Chen, Xinbin Yuan, Ruiqi Wu, Jiabao Wang, Qibin Hou, Ming-Ming Cheng [ObjD]
Intelligent Anomaly Detection for Lane Rendering Using Transformer with Self-Supervised Pre-Training and Customized Fine-Tuning (21 Feb 2025)
  Yongqi Dong, Xingmin Lu, Ruohan Li, Wei Song, B. Arem, Haneen Farah [ViT]
FSTA-SNN: Frequency-based Spatial-Temporal Attention Module for Spiking Neural Networks (15 Dec 2024)
  Kairong Yu, Tianqing Zhang, Hongwei Wang, Qi Xu
Does Self-Attention Need Separate Weights in Transformers? (30 Nov 2024)
  Md. Kowsher, Nusrat Jahan Prottasha, Chun-Nam Yu, O. Garibay, Niloofar Yousefi
Rethinking the Role of Infrastructure in Collaborative Perception (15 Oct 2024)
  Hyunchul Bae, Minhee Kang, Minwoo Song, Heejin Ahn
RICAU-Net: Residual-block Inspired Coordinate Attention U-Net for Segmentation of Small and Sparse Calcium Lesions in Cardiac CT (11 Sep 2024)
  Doyoung Park, Jinsoo Kim, Qi Chang, S. Leng, Liang Zhong, L. Baskaran
Graph Representation Learning via Causal Diffusion for Out-of-Distribution Recommendation (01 Aug 2024)
  Chu Zhao, Enneng Yang, Yuliang Liang, Pengxiang Lan, Yuting Liu, Jianzhe Zhao, Guibing Guo, Xingwei Wang [OOD, DiffM, CML]
Deciphering the Definition of Adversarial Robustness for post-hoc OOD Detectors (21 Jun 2024)
  Peter Lorenz, Mario Fernandez, Jens Müller, Ullrich Kothe [AAML]
FOOL: Addressing the Downlink Bottleneck in Satellite Computing with Neural Feature Compression (25 Mar 2024)
  Alireza Furutanpey, Qiyang Zhang, Philipp Raith, Tobias Pfandzelter, Shangguang Wang, Schahram Dustdar
Cooperation Is All You Need (16 May 2023)
  Ahsan Adeel, Junaid Muzaffar, K. Ahmed, Mohsin Raza, Eamin Chaudary, Talha Bin Riaz, Ahmed Saeed
Visual Attention Network (20 Feb 2022)
  Meng-Hao Guo, Chengrou Lu, Zheng-Ning Liu, Ming-Ming Cheng, Shiyong Hu [ViT, VLM]
DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR (28 Jan 2022)
  Shilong Liu, Feng Li, Hao Zhang, Xiaohu Yang, Xianbiao Qi, Hang Su, Jun Zhu, Lei Zhang [ViT]
Masked Autoencoders Are Scalable Vision Learners (11 Nov 2021)
  Kaiming He, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Dollár, Ross B. Girshick [ViT, TPM]
Sampling Equivariant Self-attention Networks for Object Detection in Aerial Images (05 Nov 2021)
  Guo-Ye Yang, Xiang-Li Li, Ralph Robert Martin, Shimin Hu [3DPC]
Is Attention Better Than Matrix Decomposition? (09 Sep 2021)
  Zhengyang Geng, Meng-Hao Guo, Hongxu Chen, Xia Li, Ke Wei, Zhouchen Lin
FuseFormer: Fusing Fine-Grained Information in Transformers for Video Inpainting (07 Sep 2021)
  R. Liu, Hanming Deng, Yangyi Huang, Xiaoyu Shi, Lewei Lu, Wenxiu Sun, Xiaogang Wang, Jifeng Dai, Hongsheng Li [ViT]
Query2Label: A Simple Transformer Way to Multi-Label Classification (22 Jul 2021)
  Shilong Liu, Lei Zhang, Xiao Yang, Hang Su, Jun Zhu
VOLO: Vision Outlooker for Visual Recognition (24 Jun 2021)
  Li-xin Yuan, Qibin Hou, Zihang Jiang, Jiashi Feng, Shuicheng Yan [ViT]
BEiT: BERT Pre-Training of Image Transformers (15 Jun 2021)
  Hangbo Bao, Li Dong, Songhao Piao, Furu Wei [ViT]
CoAtNet: Marrying Convolution and Attention for All Data Sizes (09 Jun 2021)
  Zihang Dai, Hanxiao Liu, Quoc V. Le, Mingxing Tan [ViT]
When Vision Transformers Outperform ResNets without Pre-training or Strong Data Augmentations (03 Jun 2021)
  Xiangning Chen, Cho-Jui Hsieh, Boqing Gong [ViT]
SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers (31 May 2021)
  Enze Xie, Wenhai Wang, Zhiding Yu, Anima Anandkumar, J. Álvarez, Ping Luo [ViT]
Can Attention Enable MLPs To Catch Up With CNNs? (31 May 2021)
  Meng-Hao Guo, Zheng-Ning Liu, Tai-Jiang Mu, Dun Liang, Ralph Robert Martin, Shimin Hu [AAML]
ResMLP: Feedforward networks for image classification with data-efficient training (07 May 2021)
  Hugo Touvron, Piotr Bojanowski, Mathilde Caron, Matthieu Cord, Alaaeldin El-Nouby, ..., Gautier Izacard, Armand Joulin, Gabriel Synnaeve, Jakob Verbeek, Hervé Jégou [VLM]
Beyond Self-attention: External Attention using Two Linear Layers for Visual Tasks (05 May 2021)
  Meng-Hao Guo, Zheng-Ning Liu, Tai-Jiang Mu, Shimin Hu
MLP-Mixer: An all-MLP Architecture for Vision (04 May 2021)
  Ilya O. Tolstikhin, N. Houlsby, Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, ..., Andreas Steiner, Daniel Keysers, Jakob Uszkoreit, Mario Lucic, Alexey Dosovitskiy
Emerging Properties in Self-Supervised Vision Transformers (29 Apr 2021)
  Mathilde Caron, Hugo Touvron, Ishan Misra, Hervé Jégou, Julien Mairal, Piotr Bojanowski, Armand Joulin
Decoupled Spatial-Temporal Transformer for Video Inpainting (14 Apr 2021)
  R. Liu, Hanming Deng, Yangyi Huang, Xiaoyu Shi, Lewei Lu, Wenxiu Sun, Xiaogang Wang, Jifeng Dai, Hongsheng Li [ViT]
An Empirical Study of Training Self-Supervised Vision Transformers (05 Apr 2021)
  Xinlei Chen, Saining Xie, Kaiming He [ViT]
Going deeper with Image Transformers (31 Mar 2021)
  Hugo Touvron, Matthieu Cord, Alexandre Sablayrolles, Gabriel Synnaeve, Hervé Jégou [ViT]
CvT: Introducing Convolutions to Vision Transformers (29 Mar 2021)
  Haiping Wu, Bin Xiao, Noel Codella, Mengchen Liu, Xiyang Dai, Lu Yuan, Lei Zhang [ViT]
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows (25 Mar 2021)
  Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, B. Guo [ViT]
DeepViT: Towards Deeper Vision Transformer (22 Mar 2021)
  Daquan Zhou, Bingyi Kang, Xiaojie Jin, Linjie Yang, Xiaochen Lian, Zihang Jiang, Qibin Hou, Jiashi Feng [ViT]
Coordinate Attention for Efficient Mobile Network Design (04 Mar 2021)
  Qibin Hou, Daquan Zhou, Jiashi Feng
Transformer in Transformer (27 Feb 2021)
  Kai Han, An Xiao, Enhua Wu, Jianyuan Guo, Chunjing Xu, Yunhe Wang [ViT]
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions (24 Feb 2021)
  Wenhai Wang, Enze Xie, Xiang Li, Deng-Ping Fan, Kaitao Song, Ding Liang, Tong Lu, Ping Luo, Ling Shao [ViT]
Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet (28 Jan 2021)
  Li-xin Yuan, Yunpeng Chen, Tao Wang, Weihao Yu, Yujun Shi, Zihang Jiang, Francis E. H. Tay, Jiashi Feng, Shuicheng Yan [ViT]
Transformers in Vision: A Survey (04 Jan 2021)
  Salman Khan, Muzammal Naseer, Munawar Hayat, Syed Waqas Zamir, Fahad Shahbaz Khan, M. Shah [ViT]
Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers (31 Dec 2020)
  Sixiao Zheng, Jiachen Lu, Hengshuang Zhao, Xiatian Zhu, Zekun Luo, ..., Yanwei Fu, Jianfeng Feng, Tao Xiang, Philip Torr, Li Zhang [ViT]
FcaNet: Frequency Channel Attention Networks (22 Dec 2020)
  Zequn Qin, Pengyi Zhang, Leilei Gan, Xi Li
PCT: Point cloud transformer (17 Dec 2020)
  Meng-Hao Guo, Junxiong Cai, Zheng-Ning Liu, Tai-Jiang Mu, Ralph Robert Martin, Shimin Hu [ViT, 3DPC]
GTA: Global Temporal Attention for Video Action Understanding (15 Dec 2020)
  Bo He, Xitong Yang, Zuxuan Wu, Hao Chen, Ser-Nam Lim, Abhinav Shrivastava [ViT]
Pre-Trained Image Processing Transformer (01 Dec 2020)
  Hanting Chen, Yunhe Wang, Tianyu Guo, Chang Xu, Yiping Deng, Zhenhua Liu, Siwei Ma, Chunjing Xu, Chao Xu, Wen Gao [VLM, ViT]
Point Transformer (02 Nov 2020)
  Nico Engel, Vasileios Belagiannis, Klaus C. J. Dietmayer [3DPC]
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (22 Oct 2020)
  Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, ..., Matthias Minderer, G. Heigold, Sylvain Gelly, Jakob Uszkoreit, N. Houlsby [ViT]
Deformable DETR: Deformable Transformers for End-to-End Object Detection (08 Oct 2020)
  Xizhou Zhu, Weijie Su, Lewei Lu, Bin Li, Xiaogang Wang, Jifeng Dai [ViT]
Rotate to Attend: Convolutional Triplet Attention Module (06 Oct 2020)
  Diganta Misra, Trikay Nalamada, Ajay Uppili Arasanipalai, Qibin Hou [ViT, 3DPC]
Sharpness-Aware Minimization for Efficiently Improving Generalization (03 Oct 2020)
  Pierre Foret, Ariel Kleiner, H. Mobahi, Behnam Neyshabur [AAML]
Rethinking Attention with Performers (30 Sep 2020)
  K. Choromanski, Valerii Likhosherstov, David Dohan, Xingyou Song, Andreea Gane, ..., Afroz Mohiuddin, Lukasz Kaiser, David Belanger, Lucy J. Colwell, Adrian Weller