Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2211.06004
Cited By
A Comprehensive Survey of Transformers for Computer Vision
11 November 2022
Sonain Jamil
Md. Jalil Piran
Oh-Jin Kwon
ViT
Re-assign community
ArXiv
PDF
HTML
Papers citing
"A Comprehensive Survey of Transformers for Computer Vision"
48 / 48 papers shown
Title
CENet: Context Enhancement Network for Medical Image Segmentation
Afshin Bozorgpour
Sina Ghorbani Kolahi
Reza Azad
Ilker Hacihaliloglu
Dorit Merhof
183
0
0
23 May 2025
Bridged Transformer for Vision and Point Cloud 3D Object Detection
Yikai Wang
Tengqi Ye
Lele Cao
Wen-bing Huang
Gang Hua
Fengxiang He
Dacheng Tao
ViT
73
34
0
04 Oct 2022
UNetFormer: A Unified Vision Transformer Model and Pre-Training Framework for 3D Medical Image Segmentation
Ali Hatamizadeh
Ziyue Xu
Dong Yang
Wenqi Li
H. Roth
Daguang Xu
ViT
MedIm
48
29
0
01 Apr 2022
AnoViT: Unsupervised Anomaly Detection and Localization with Vision Transformer-based Encoder-Decoder
Yunseung Lee
Pilsung Kang
ViT
71
76
0
21 Mar 2022
Spatio-temporal Vision Transformer for Super-resolution Microscopy
Charles N Christensen
M. Lu
Edward N. Ward
Pietro Lio
C. Kaminski
45
8
0
28 Feb 2022
Entroformer: A Transformer-based Entropy Model for Learned Image Compression
Yichen Qian
Ming Lin
Xiuyu Sun
Zhiyu Tan
Rong Jin
ViT
59
140
0
11 Feb 2022
ViTBIS: Vision Transformer for Biomedical Image Segmentation
Abhinav Sagar
ViT
MedIm
77
18
0
15 Jan 2022
LAVT: Language-Aware Vision Transformer for Referring Image Segmentation
Zhao Yang
Jiaqi Wang
Yansong Tang
Kai-xiang Chen
Hengshuang Zhao
Philip Torr
204
326
0
04 Dec 2021
Benchmarking Detection Transfer Learning with Vision Transformers
Yanghao Li
Saining Xie
Xinlei Chen
Piotr Dollar
Kaiming He
Ross B. Girshick
67
168
0
22 Nov 2021
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
454
7,739
0
11 Nov 2021
A Survey of Visual Transformers
Yang Liu
Yao Zhang
Yixin Wang
Feng Hou
Jin Yuan
Jiang Tian
Yang Zhang
Zhongchao Shi
Jianping Fan
Zhiqiang He
3DGS
ViT
136
350
0
11 Nov 2021
Do Vision Transformers See Like Convolutional Neural Networks?
M. Raghu
Thomas Unterthiner
Simon Kornblith
Chiyuan Zhang
Alexey Dosovitskiy
ViT
117
955
0
19 Aug 2021
Vision Transformer for femur fracture classification
L. Tanzi
A. Audisio
G. Cirrincione
A. Aprato
E. Vezzetti
MedIm
54
64
0
07 Aug 2021
Combining EfficientNet and Vision Transformers for Video Deepfake Detection
D. Coccomini
Nicola Messina
Claudio Gennaro
Fabrizio Falchi
ViT
76
172
0
06 Jul 2021
The RSNA-ASNR-MICCAI BraTS 2021 Benchmark on Brain Tumor Segmentation and Radiogenomic Classification
Ujjwal Baid
S. Ghodasara
S. Mohan
Michel Bilello
Evan Calabrese
...
M. Weber
A. Mahajan
Bjoern Menze
Adam Flanders
Spyridon Bakas
136
632
0
05 Jul 2021
Early Convolutions Help Transformers See Better
Tete Xiao
Mannat Singh
Eric Mintun
Trevor Darrell
Piotr Dollár
Ross B. Girshick
52
768
0
28 Jun 2021
BEiT: BERT Pre-Training of Image Transformers
Hangbo Bao
Li Dong
Songhao Piao
Furu Wei
ViT
258
2,824
0
15 Jun 2021
The Medical Segmentation Decathlon
Michela Antonelli
Annika Reinke
Spyridon Bakas
Keyvan Farahani
AnnetteKopp-Schneider
...
Zhanwei Xu
Yefeng Zheng
Amber L. Simpson
Lena Maier-Hein
M. Jorge Cardoso
OOD
106
983
0
10 Jun 2021
CoAtNet: Marrying Convolution and Attention for All Data Sizes
Zihang Dai
Hanxiao Liu
Quoc V. Le
Mingxing Tan
ViT
112
1,204
0
09 Jun 2021
TED-net: Convolution-free T2T Vision Transformer-based Encoder-decoder Dilation network for Low-dose CT Denoising
Dayang Wang
Zhan Wu
Hengyong Yu
ViT
MedIm
50
53
0
08 Jun 2021
A Survey of Transformers
Tianyang Lin
Yuxin Wang
Xiangyang Liu
Xipeng Qiu
ViT
148
1,124
0
08 Jun 2021
You Only Look at One Sequence: Rethinking Transformer in Vision through Object Detection
Yuxin Fang
Bencheng Liao
Xinggang Wang
Jiemin Fang
Jiyang Qi
Rui Wu
Jianwei Niu
Wenyu Liu
ViT
66
323
0
01 Jun 2021
Intriguing Properties of Vision Transformers
Muzammal Naseer
Kanchana Ranasinghe
Salman Khan
Munawar Hayat
Fahad Shahbaz Khan
Ming-Hsuan Yang
ViT
313
647
0
21 May 2021
VT-ADL: A Vision Transformer Network for Image Anomaly Detection and Localization
P. Mishra
Riccardo Verk
Daniele Fornasier
C. Piciarelli
G. Foresti
ViT
102
300
0
20 Apr 2021
Group-Free 3D Object Detection via Transformers
Ze Liu
Zheng Zhang
Yue Cao
Han Hu
Xin Tong
ViT
3DPC
85
313
0
01 Apr 2021
Rethinking Spatial Dimensions of Vision Transformers
Byeongho Heo
Sangdoo Yun
Dongyoon Han
Sanghyuk Chun
Junsuk Choe
Seong Joon Oh
ViT
494
581
0
30 Mar 2021
CvT: Introducing Convolutions to Vision Transformers
Haiping Wu
Bin Xiao
Noel Codella
Mengchen Liu
Xiyang Dai
Lu Yuan
Lei Zhang
ViT
152
1,909
0
29 Mar 2021
CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification
Chun-Fu Chen
Quanfu Fan
Yikang Shen
ViT
71
1,478
0
27 Mar 2021
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
Ze Liu
Yutong Lin
Yue Cao
Han Hu
Yixuan Wei
Zheng Zhang
Stephen Lin
B. Guo
ViT
444
21,418
0
25 Mar 2021
Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
365
2,048
0
09 Feb 2021
TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation
Jieneng Chen
Yongyi Lu
Qihang Yu
Xiangde Luo
Ehsan Adeli
Yan Wang
Le Lu
Alan Yuille
Yuyin Zhou
ViT
MedIm
98
3,473
0
08 Feb 2021
Transformers in Vision: A Survey
Salman Khan
Muzammal Naseer
Munawar Hayat
Syed Waqas Zamir
Fahad Shahbaz Khan
M. Shah
ViT
302
2,514
0
04 Jan 2021
Toward Transformer-Based Object Detection
Josh Beal
Eric Kim
Eric Tzeng
Dong Huk Park
Andrew Zhai
Dmitry Kislyuk
ViT
86
215
0
17 Dec 2020
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
637
41,003
0
22 Oct 2020
Efficient Transformers: A Survey
Yi Tay
Mostafa Dehghani
Dara Bahri
Donald Metzler
VLM
149
1,122
0
14 Sep 2020
Visual Transformers: Token-based Image Representation and Processing for Computer Vision
Bichen Wu
Chenfeng Xu
Xiaoliang Dai
Alvin Wan
Peizhao Zhang
Zhicheng Yan
Masayoshi Tomizuka
Joseph E. Gonzalez
Kurt Keutzer
Peter Vajda
ViT
98
559
0
05 Jun 2020
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
Mingxing Tan
Quoc V. Le
3DV
MedIm
137
18,115
0
28 May 2019
BERT Rediscovers the Classical NLP Pipeline
Ian Tenney
Dipanjan Das
Ellie Pavlick
MILM
SSeg
135
1,471
0
15 May 2019
MobileNetV2: Inverted Residuals and Linear Bottlenecks
Mark Sandler
Andrew G. Howard
Menglong Zhu
A. Zhmoginov
Liang-Chieh Chen
181
19,271
0
13 Jan 2018
ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices
Xiangyu Zhang
Xinyu Zhou
Mengxiao Lin
Jian Sun
AI4TS
138
6,867
0
04 Jul 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
690
131,526
0
12 Jun 2017
ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes
Angela Dai
Angel X. Chang
Manolis Savva
Maciej Halber
Thomas Funkhouser
Matthias Nießner
3DPC
3DV
469
4,058
0
14 Feb 2017
Wider or Deeper: Revisiting the ResNet Model for Visual Recognition
Zifeng Wu
Chunhua Shen
Anton Van Den Hengel
SSeg
633
1,515
0
30 Nov 2016
Xception: Deep Learning with Depthwise Separable Convolutions
François Chollet
MDE
BDL
PINN
1.4K
14,559
0
07 Oct 2016
Modeling Context in Referring Expressions
Licheng Yu
Patrick Poirson
Shan Yang
Alexander C. Berg
Tamara L. Berg
129
1,263
0
31 Jul 2016
SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size
F. Iandola
Song Han
Matthew W. Moskewicz
Khalid Ashraf
W. Dally
Kurt Keutzer
150
7,482
0
24 Feb 2016
Deep Residual Learning for Image Recognition
Kaiming He
Xinming Zhang
Shaoqing Ren
Jian Sun
MedIm
2.2K
193,878
0
10 Dec 2015
Generation and Comprehension of Unambiguous Object Descriptions
Junhua Mao
Jonathan Huang
Alexander Toshev
Oana-Maria Camburu
Alan Yuille
Kevin Patrick Murphy
ObjD
118
1,345
0
07 Nov 2015
1