2101.01169
Transformers in Vision: A Survey
4 January 2021
Salman Khan
Muzammal Naseer
Munawar Hayat
Syed Waqas Zamir
Fahad Shahbaz Khan
M. Shah
ViT
Papers citing
"Transformers in Vision: A Survey"
50 / 263 papers shown
Title
SceneFormer: Indoor Scene Generation with Transformers
Xinpeng Wang
Chandan Yeshwanth
Matthias Nießner
ViT
3DPC
54
154
0
17 Dec 2020
End-to-End Human Pose and Mesh Reconstruction with Transformers
Kevin Lin
Lijuan Wang
Zicheng Liu
ViT
82
627
0
17 Dec 2020
PCT: Point cloud transformer
Meng-Hao Guo
Junxiong Cai
Zheng-Ning Liu
Tai-Jiang Mu
Ralph Robert Martin
Shimin Hu
ViT
3DPC
194
1,630
0
17 Dec 2020
Spatial Temporal Transformer Network for Skeleton-based Action Recognition
Chiara Plizzari
Marco Cannici
Matteo Matteucci
ViT
55
200
0
11 Dec 2020
Topological Planning with Transformers for Vision-and-Language Navigation
Kevin Chen
Junshen K. Chen
Jo Chuang
Nathan Tsoi
Silvio Savarese
LM&Ro
87
101
0
09 Dec 2020
Parameter Efficient Multimodal Transformers for Video Representation Learning
Sangho Lee
Youngjae Yu
Gunhee Kim
Thomas Breuel
Jan Kautz
Yale Song
ViT
104
78
0
08 Dec 2020
Pre-Trained Image Processing Transformer
Hanting Chen
Yunhe Wang
Tianyu Guo
Chang Xu
Yiping Deng
Zhenhua Liu
Siwei Ma
Chunjing Xu
Chao Xu
Wen Gao
VLM
ViT
147
1,681
0
01 Dec 2020
End-to-End Video Instance Segmentation with Transformers
Yuqing Wang
Zhaoliang Xu
Xinlong Wang
Chunhua Shen
Baoshan Cheng
Hao Shen
Huaxia Xia
ViT
94
692
0
30 Nov 2020
Attention-Based Transformers for Instance Segmentation of Cells in Microstructures
Tim Prangemeier
Christoph Reich
Heinz Koeppl
MedIm
ViT
58
93
0
19 Nov 2020
Point Transformer
Nico Engel
Vasileios Belagiannis
Klaus C. J. Dietmayer
3DPC
186
2,008
0
02 Nov 2020
COOT: Cooperative Hierarchical Transformer for Video-Text Representation Learning
Simon Ging
Mohammadreza Zolfaghari
Hamed Pirsiavash
Thomas Brox
ViT
CLIP
75
172
0
01 Nov 2020
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
684
41,483
0
22 Oct 2020
Vokenization: Improving Language Understanding with Contextualized, Visual-Grounded Supervision
Hao Tan
Joey Tianyi Zhou
CLIP
83
121
0
14 Oct 2020
Deformable DETR: Deformable Transformers for End-to-End Object Detection
Xizhou Zhu
Weijie Su
Lewei Lu
Bin Li
Xiaogang Wang
Jifeng Dai
ViT
260
5,102
0
08 Oct 2020
Sharpness-Aware Minimization for Efficiently Improving Generalization
Pierre Foret
Ariel Kleiner
H. Mobahi
Behnam Neyshabur
AAML
199
1,359
0
03 Oct 2020
Rethinking Attention with Performers
K. Choromanski
Valerii Likhosherstov
David Dohan
Xingyou Song
Andreea Gane
...
Afroz Mohiuddin
Lukasz Kaiser
David Belanger
Lucy J. Colwell
Adrian Weller
186
1,604
0
30 Sep 2020
Efficient Transformers: A Survey
Yi Tay
Mostafa Dehghani
Dara Bahri
Donald Metzler
VLM
170
1,128
0
14 Sep 2020
Single Image Super-Resolution via a Holistic Attention Network
Ben Niu
Weilei Wen
Wenqi Ren
Xiangde Zhang
Lianping Yang
Shuzhen Wang
Kaihao Zhang
Xiaochun Cao
Haifeng Shen
SupR
72
631
0
20 Aug 2020
CrossTransformers: spatially-aware few-shot transfer
Carl Doersch
Ankush Gupta
Andrew Zisserman
ViT
285
336
0
22 Jul 2020
FTRANS: Energy-Efficient Acceleration of Transformers using FPGA
Bingbing Li
Santosh Pandey
Haowen Fang
Yanjun Lyv
Ji Li
Jieyang Chen
Mimi Xie
Lipeng Wan
Hang Liu
Caiwen Ding
AI4CE
69
179
0
16 Jul 2020
Fast Transformers with Clustered Attention
Apoorv Vyas
Angelos Katharopoulos
François Fleuret
62
154
0
09 Jul 2020
GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding
Dmitry Lepikhin
HyoukJoong Lee
Yuanzhong Xu
Dehao Chen
Orhan Firat
Yanping Huang
M. Krikun
Noam M. Shazeer
Zhifeng Chen
MoE
132
1,191
0
30 Jun 2020
A Universal Representation Transformer Layer for Few-Shot Image Classification
Lu Liu
William L. Hamilton
Guodong Long
Jing Jiang
Hugo Larochelle
ViT
101
127
0
21 Jun 2020
Self-supervised Learning: Generative or Contrastive
Xiao Liu
Fanjin Zhang
Zhenyu Hou
Zhaoyu Wang
Li Mian
Jing Zhang
Jie Tang
SSL
155
1,635
0
15 Jun 2020
Bootstrap your own latent: A new approach to self-supervised Learning
Jean-Bastien Grill
Florian Strub
Florent Altché
Corentin Tallec
Pierre Harvey Richemond
...
M. G. Azar
Bilal Piot
Koray Kavukcuoglu
Rémi Munos
Michal Valko
SSL
405
6,844
0
13 Jun 2020
VirTex: Learning Visual Representations from Textual Annotations
Karan Desai
Justin Johnson
SSL
VLM
165
436
0
11 Jun 2020
Linformer: Self-Attention with Linear Complexity
Sinong Wang
Belinda Z. Li
Madian Khabsa
Han Fang
Hao Ma
219
1,716
0
08 Jun 2020
Learning Texture Transformer Network for Image Super-Resolution
Fuzhi Yang
Huan Yang
Jianlong Fu
Hongtao Lu
B. Guo
SupR
ViT
77
726
0
07 Jun 2020
HAT: Hardware-Aware Transformers for Efficient Natural Language Processing
Hanrui Wang
Zhanghao Wu
Zhijian Liu
Han Cai
Ligeng Zhu
Chuang Gan
Song Han
91
262
0
28 May 2020
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
904
42,463
0
28 May 2020
End-to-End Object Detection with Transformers
Nicolas Carion
Francisco Massa
Gabriel Synnaeve
Nicolas Usunier
Alexander Kirillov
Sergey Zagoruyko
ViT
3DV
PINN
454
13,130
0
26 May 2020
Quantifying Attention Flow in Transformers
Samira Abnar
Willem H. Zuidema
169
803
0
02 May 2020
Synthesizer: Rethinking Self-Attention in Transformer Models
Yi Tay
Dara Bahri
Donald Metzler
Da-Cheng Juan
Zhe Zhao
Che Zheng
64
339
0
02 May 2020
Improving Vision-and-Language Navigation with Image-Text Pairs from the Web
Arjun Majumdar
Ayush Shrivastava
Stefan Lee
Peter Anderson
Devi Parikh
Dhruv Batra
LM&Ro
166
235
0
30 Apr 2020
Exploring Self-attention for Image Recognition
Hengshuang Zhao
Jiaya Jia
V. Koltun
SSL
98
786
0
28 Apr 2020
Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks
Xiujun Li
Xi Yin
Chunyuan Li
Pengchuan Zhang
Xiaowei Hu
...
Houdong Hu
Li Dong
Furu Wei
Yejin Choi
Jianfeng Gao
VLM
148
1,947
0
13 Apr 2020
Longformer: The Long-Document Transformer
Iz Beltagy
Matthew E. Peters
Arman Cohan
RALM
VLM
187
4,100
0
10 Apr 2020
Designing Network Design Spaces
Ilija Radosavovic
Raj Prateek Kosaraju
Ross B. Girshick
Kaiming He
Piotr Dollár
GNN
107
1,697
0
30 Mar 2020
Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation
Huiyu Wang
Yukun Zhu
Bradley Green
Hartwig Adam
Alan Yuille
Liang-Chieh Chen
3DPC
132
673
0
17 Mar 2020
Improved Baselines with Momentum Contrastive Learning
Xinlei Chen
Haoqi Fan
Ross B. Girshick
Kaiming He
SSL
508
3,444
0
09 Mar 2020
Sparse Sinkhorn Attention
Yi Tay
Dara Bahri
Liu Yang
Donald Metzler
Da-Cheng Juan
91
342
0
26 Feb 2020
Towards Learning a Generic Agent for Vision-and-Language Navigation via Pre-training
Weituo Hao
Chunyuan Li
Xiujun Li
Lawrence Carin
Jianfeng Gao
LM&Ro
90
282
0
25 Feb 2020
A Simple Framework for Contrastive Learning of Visual Representations
Ting Chen
Simon Kornblith
Mohammad Norouzi
Geoffrey E. Hinton
SSL
390
18,897
0
13 Feb 2020
Reformer: The Efficient Transformer
Nikita Kitaev
Lukasz Kaiser
Anselm Levskaya
VLM
207
2,333
0
13 Jan 2020
Axial Attention in Multidimensional Transformers
Jonathan Ho
Nal Kalchbrenner
Dirk Weissenborn
Tim Salimans
110
533
0
20 Dec 2019
Classifying, Segmenting, and Tracking Object Instances in Video with Mask Propagation
Gedas Bertasius
Lorenzo Torresani
93
179
0
10 Dec 2019
Analyzing and Improving the Image Quality of StyleGAN
Tero Karras
S. Laine
M. Aittala
Janne Hellsten
J. Lehtinen
Timo Aila
GAN
329
5,829
0
03 Dec 2019
Momentum Contrast for Unsupervised Visual Representation Learning
Kaiming He
Haoqi Fan
Yuxin Wu
Saining Xie
Ross B. Girshick
SSL
216
12,136
0
13 Nov 2019
On the Relationship between Self-Attention and Convolutional Layers
Jean-Baptiste Cordonnier
Andreas Loukas
Martin Jaggi
116
535
0
08 Nov 2019
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
503
20,342
0
23 Oct 2019