Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2101.01169
Cited By
v1
v2
v3
v4
v5 (latest)
Transformers in Vision: A Survey
4 January 2021
Salman Khan
Muzammal Naseer
Munawar Hayat
Syed Waqas Zamir
Fahad Shahbaz Khan
M. Shah
ViT
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Transformers in Vision: A Survey"
50 / 263 papers shown
Title
Image Recognition with Online Lightweight Vision Transformer: A Survey
Zherui Zhang
Rongtao Xu
Jie Zhou
Changwei Wang
Xingtian Pei
...
Jiguang Zhang
Li Guo
Longxiang Gao
Wenyuan Xu
Shibiao Xu
ViT
496
0
0
06 May 2025
Remote sensing colour image semantic segmentation of trails created by large herbivorous Mammals
J. Díez-Pastor
Francisco Javier Gonzalez-Moya
Pedro Latorre-Carmona
Francisco Javier Perez-Barbería
Ludmila I.Kuncheva
Antonio Canepa-Oneto
Alvar Arnaiz-González
C. García-Osorio
271
0
0
16 Apr 2025
Embedding Radiomics into Vision Transformers for Multimodal Medical Image Classification
Zhenyu Yang
Haiming Zhu
Rihui Zhang
Haipeng Zhang
Jianliang Wang
Chunhao Wang
Minbin Chen
F. Yin
MedIm
121
0
0
15 Apr 2025
A super-resolution reconstruction method for lightweight building images based on an expanding feature modulation network
Yi Zhang
Wenye Zhou
Ruonan Lin
SupR
118
0
0
17 Mar 2025
MaskAttn-UNet: A Mask Attention-Driven Framework for Universal Low-Resolution Image Segmentation
Anzhe Cheng
Chenzhong Yin
Yu Chang
Heng Ping
Shixuan Li
Shahin Nazarian
Paul Bogdan
SSeg
263
0
0
11 Mar 2025
Looped ReLU MLPs May Be All You Need as Practical Programmable Computers
Yingyu Liang
Zhizhou Sha
Zhenmei Shi
Zhao Song
Yufa Zhou
165
19
0
21 Feb 2025
YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection
Yuming Chen
Xinbin Yuan
Ruiqi Wu
Jiabao Wang
Qibin Hou
Mingg-Ming Cheng
Ming-Ming Cheng
ObjD
275
53
0
21 Feb 2025
Infrared Image Super-Resolution: Systematic Review, and Future Trends
Y. Huang
Tomo Miyazaki
Xiao-Fang Liu
S. Omachi
SupR
152
13
0
21 Feb 2025
Learning county from pixels: Corn yield prediction with attention-weighted multiple instance learning
Xiaoyu Wang
Yuchi Ma
Qunying Huang
Zhengwei Yang
Zhou Zhang
199
1
0
17 Feb 2025
TLOB: A Novel Transformer Model with Dual Attention for Price Trend Prediction with Limit Order Book Data
Leonardo Berti
Gjergji Kasneci
AI4TS
187
1
0
12 Feb 2025
MoENAS: Mixture-of-Expert based Neural Architecture Search for jointly Accurate, Fair, and Robust Edge Deep Neural Networks
Lotfi Abdelkrim Mecharbat
Alberto Marchisio
Mohamed Bennai
M. Ghassemi
Tuka Alhanai
181
0
0
11 Feb 2025
Protego: Detecting Adversarial Examples for Vision Transformers via Intrinsic Capabilities
Jialin Wu
Kaikai Pan
Yanjiao Chen
Jiangyi Deng
Shengyuan Pang
Wenyuan Xu
ViT
AAML
107
0
0
13 Jan 2025
Generation from Noisy Examples
A. Raman
Vinod Raman
91
1
0
07 Jan 2025
Semantics Prompting Data-Free Quantization for Low-Bit Vision Transformers
Mingliang Xu
Yuyao Zhou
Yuxin Zhang
Shen Li
Yong Li
Chia-Wen Lin
Zhanpeng Zeng
Rongrong Ji
MQ
316
0
0
31 Dec 2024
Vision Transformers for Efficient Indoor Pathloss Radio Map Prediction
Rafayel Mkrtchyan
Edvard Ghukasyan
Khoren Petrosyan
Hrant Khachatrian
Theofanis P. Raptis
135
0
0
12 Dec 2024
Passive Deepfake Detection Across Multi-modalities: A Comprehensive Survey
Hong-Hanh Nguyen-Le
Van-Tuan Tran
Dinh-Thuc Nguyen
Nhien-An Le-Khac
AAML
172
2
0
26 Nov 2024
HMT-Grasp: A Hybrid Mamba-Transformer Approach for Robot Grasping in Cluttered Environments
Songsong Xiong
Hamidreza Kasaei
99
1
0
04 Oct 2024
SoK: Leveraging Transformers for Malware Analysis
Pradip Kunwar
Kshitiz Aryal
Maanak Gupta
Mahmoud Abdelsalam
Elisa Bertino
143
0
0
27 May 2024
A Survey on Vision-Language-Action Models for Embodied AI
Yueen Ma
Zixing Song
Yuzheng Zhuang
Jianye Hao
Irwin King
LM&Ro
307
54
0
23 May 2024
Segmentation Guided Sparse Transformer for Under-Display Camera Image Restoration
Jingyun Xue
Tao Wang
Jun Wang
Kaihao Zhang
ViT
85
2
0
09 Mar 2024
Masked LoGoNet: Fast and Accurate 3D Image Analysis for Medical Domain
Amin Karimi Monsefi
Payam Karisani
Mengxi Zhou
Stacey S. Choi
Nathan Doble
Heng Ji
Srinivasan Parthasarathy
R. Ramnath
85
5
0
09 Feb 2024
Parameter-Efficient Fine-Tuning for Pre-Trained Vision Models: A Survey
Yi Xin
Jianjiang Yang
Haodi Zhou
Junlong Du
Junlong Du
Yue Fan
Qing Li
Qing Li
Yuntao Du
VLM
143
85
0
03 Feb 2024
OnDev-LCT: On-Device Lightweight Convolutional Transformers towards federated learning
Chu Myaet Thwal
Minh N. H. Nguyen
Ye Lin Tun
Seongjin Kim
My T. Thai
Choong Seon Hong
112
6
0
22 Jan 2024
TouchUp-G: Improving Feature Representation through Graph-Centric Finetuning
Jing Zhu
Xiang Song
V. Ioannidis
Danai Koutra
Christos Faloutsos
151
15
0
25 Sep 2023
Which Transformer to Favor: A Comparative Analysis of Efficiency in Vision Transformers
Tobias Christian Nauen
Sebastián M. Palacio
Federico Raue
Andreas Dengel
135
4
0
18 Aug 2023
Local-Aware Global Attention Network for Person Re-Identification Based on Body and Hand Images
N. L. Baisa
CVBM
86
4
0
11 Sep 2022
Unsupervised Domain Adaptation via Style-Aware Self-intermediate Domain
Lianyu Wang
Meng Wang
Daoqiang Zhang
Huazhu Fu
91
2
0
05 Sep 2022
Generative Adversarial Networks
Gilad Cohen
Raja Giryes
GAN
298
30,152
0
01 Mar 2022
Patches Are All You Need?
Asher Trockman
J. Zico Kolter
ViT
268
411
0
24 Jan 2022
Restormer: Efficient Transformer for High-Resolution Image Restoration
Syed Waqas Zamir
Aditya Arora
Salman Khan
Munawar Hayat
Fahad Shahbaz Khan
Ming-Hsuan Yang
ViT
182
2,249
0
18 Nov 2021
Pix2seq: A Language Modeling Framework for Object Detection
Ting-Li Chen
Saurabh Saxena
Lala Li
David J. Fleet
Geoffrey E. Hinton
MLLM
ViT
VLM
279
350
0
22 Sep 2021
SwinIR: Image Restoration Using Swin Transformer
Christos Sakaridis
Jie Cao
Guolei Sun
Peng Sun
Luc Van Gool
Radu Timofte
ViT
196
2,956
0
23 Aug 2021
Perceiver IO: A General Architecture for Structured Inputs & Outputs
Andrew Jaegle
Sebastian Borgeaud
Jean-Baptiste Alayrac
Carl Doersch
Catalin Ionescu
...
Olivier J. Hénaff
M. Botvinick
Andrew Zisserman
Oriol Vinyals
João Carreira
MLLM
VLM
GNN
93
584
0
30 Jul 2021
GLiT: Neural Architecture Search for Global and Local Image Transformer
Boyu Chen
Peixia Li
Chuming Li
Baopu Li
Lei Bai
Chen Lin
Ming Sun
Junjie Yan
Wanli Ouyang
ViT
103
86
0
07 Jul 2021
CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows
Xiaoyi Dong
Jianmin Bao
Dongdong Chen
Weiming Zhang
Nenghai Yu
Lu Yuan
Dong Chen
B. Guo
ViT
154
986
0
01 Jul 2021
AutoFormer: Searching Transformers for Visual Recognition
Minghao Chen
Houwen Peng
Jianlong Fu
Haibin Ling
ViT
97
267
0
01 Jul 2021
Focal Self-attention for Local-Global Interactions in Vision Transformers
Jianwei Yang
Chunyuan Li
Pengchuan Zhang
Xiyang Dai
Bin Xiao
Lu Yuan
Jianfeng Gao
ViT
80
436
0
01 Jul 2021
PVT v2: Improved Baselines with Pyramid Vision Transformer
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
AI4TS
122
1,682
0
25 Jun 2021
P2T: Pyramid Pooling Transformer for Scene Understanding
Yu-Huan Wu
Yun-Hai Liu
Xin Zhan
Mingg-Ming Cheng
ViT
111
231
0
22 Jun 2021
Efficient Self-supervised Vision Transformers for Representation Learning
Chunyuan Li
Jianwei Yang
Pengchuan Zhang
Mei Gao
Bin Xiao
Xiyang Dai
Lu Yuan
Jianfeng Gao
ViT
106
213
0
17 Jun 2021
XCiT: Cross-Covariance Image Transformers
Alaaeldin El-Nouby
Hugo Touvron
Mathilde Caron
Piotr Bojanowski
Matthijs Douze
...
Ivan Laptev
Natalia Neverova
Gabriel Synnaeve
Jakob Verbeek
Hervé Jégou
ViT
151
513
0
17 Jun 2021
Long-Short Temporal Contrastive Learning of Video Transformers
Jue Wang
Gedas Bertasius
Du Tran
Lorenzo Torresani
VLM
ViT
125
50
0
17 Jun 2021
CoAtNet: Marrying Convolution and Attention for All Data Sizes
Zihang Dai
Hanxiao Liu
Quoc V. Le
Mingxing Tan
ViT
141
1,212
0
09 Jun 2021
Scaling Vision Transformers
Xiaohua Zhai
Alexander Kolesnikov
N. Houlsby
Lucas Beyer
ViT
144
1,098
0
08 Jun 2021
On Improving Adversarial Transferability of Vision Transformers
Muzammal Naseer
Kanchana Ranasinghe
Salman Khan
Fahad Shahbaz Khan
Fatih Porikli
ViT
91
95
0
08 Jun 2021
Shuffle Transformer: Rethinking Spatial Shuffle for Vision Transformer
Zilong Huang
Youcheng Ben
Guozhong Luo
Pei Cheng
Gang Yu
Bin-Bin Fu
ViT
87
183
0
07 Jun 2021
Uformer: A General U-Shaped Transformer for Image Restoration
Zhendong Wang
Xiaodong Cun
Jianmin Bao
Wengang Zhou
Jianzhuang Liu
Houqiang Li
ViT
119
1,419
0
06 Jun 2021
Referring Transformer: A One-step Approach to Multi-task Visual Grounding
Muchen Li
Leonid Sigal
ObjD
105
193
0
06 Jun 2021
RegionViT: Regional-to-Local Attention for Vision Transformers
Chun-Fu Chen
Yikang Shen
Quanfu Fan
ViT
118
197
0
04 Jun 2021
When Vision Transformers Outperform ResNets without Pre-training or Strong Data Augmentations
Xiangning Chen
Cho-Jui Hsieh
Boqing Gong
ViT
93
329
0
03 Jun 2021
1
2
3
4
5
6
Next