arXiv: 2106.04560
Scaling Vision Transformers
8 June 2021
Xiaohua Zhai
Alexander Kolesnikov
N. Houlsby
Lucas Beyer
ViT
Papers citing
"Scaling Vision Transformers"
50 / 751 papers shown
Title
Combined Scaling for Zero-shot Transfer Learning
Hieu H. Pham
Zihang Dai
Golnaz Ghiasi
Kenji Kawaguchi
Hanxiao Liu
...
Yi-Ting Chen
Minh-Thang Luong
Yonghui Wu
Mingxing Tan
Quoc V. Le
VLM
17
193
0
19 Nov 2021
SimMIM: A Simple Framework for Masked Image Modeling
Zhenda Xie
Zheng-Wei Zhang
Yue Cao
Yutong Lin
Jianmin Bao
Zhuliang Yao
Qi Dai
Han Hu
60
1,309
0
18 Nov 2021
Swin Transformer V2: Scaling Up Capacity and Resolution
Ze Liu
Han Hu
Yutong Lin
Zhuliang Yao
Zhenda Xie
...
Yue Cao
Zheng-Wei Zhang
Li Dong
Furu Wei
B. Guo
ViT
67
1,747
0
18 Nov 2021
INTERN: A New Learning Paradigm Towards General Vision
Jing Shao
Siyu Chen
Yangguang Li
Kun Wang
Zhen-fei Yin
...
F. Yu
Junjie Yan
Dahua Lin
Xiaogang Wang
Yu Qiao
18
34
0
16 Nov 2021
LiT: Zero-Shot Transfer with Locked-image text Tuning
Xiaohua Zhai
Tianlin Li
Basil Mustafa
Andreas Steiner
Daniel Keysers
Alexander Kolesnikov
Lucas Beyer
VLM
48
543
0
15 Nov 2021
Scaling Law for Recommendation Models: Towards General-purpose User Representations
Kyuyong Shin
Hanock Kwak
KyungHyun Kim
Max Nihlén Ramström
Jisu Jeong
Jung-Woo Ha
S. Kim
ELM
36
38
0
15 Nov 2021
A Survey of Visual Transformers
Yang Liu
Yao Zhang
Yixin Wang
Feng Hou
Jin Yuan
Jiang Tian
Yang Zhang
Zhongchao Shi
Jianping Fan
Zhiqiang He
3DGS
ViT
77
330
0
11 Nov 2021
Soft Sensing Transformer: Hundreds of Sensors are Worth a Single Word
Chao Zhang
Jaswanth K. Yella
Yu Huang
Xiaoye Qian
Sergei Petrov
A. Rzhetsky
Sthitie Bom
29
14
0
10 Nov 2021
Are Transformers More Robust Than CNNs?
Yutong Bai
Jieru Mei
Alan Yuille
Cihang Xie
ViT
AAML
192
257
0
10 Nov 2021
LAION-400M: Open Dataset of CLIP-Filtered 400 Million Image-Text Pairs
Christoph Schuhmann
Richard Vencu
Romain Beaumont
R. Kaczmarczyk
Clayton Mullis
Aarush Katta
Theo Coombes
J. Jitsev
Aran Komatsuzaki
VLM
MLLM
CLIP
15
1,377
0
03 Nov 2021
A Survey of Self-Supervised and Few-Shot Object Detection
Gabriel Huang
I. Laradji
David Vazquez
Simon Lacoste-Julien
Pau Rodríguez López
ObjD
27
77
0
27 Oct 2021
The Efficiency Misnomer
Daoyuan Chen
Liuyi Yao
Dawei Gao
Ashish Vaswani
Yaliang Li
34
99
0
25 Oct 2021
Sinkformers: Transformers with Doubly Stochastic Attention
Michael E. Sander
Pierre Ablin
Mathieu Blondel
Gabriel Peyré
29
76
0
22 Oct 2021
No One Representation to Rule Them All: Overlapping Features of Training Methods
Raphael Gontijo-Lopes
Yann N. Dauphin
E. D. Cubuk
20
60
0
20 Oct 2021
Understanding and Improving Robustness of Vision Transformers through Patch-based Negative Augmentation
Yao Qin
Chiyuan Zhang
Ting Chen
Balaji Lakshminarayanan
Alex Beutel
Xuezhi Wang
ViT
50
42
0
15 Oct 2021
Transform and Bitstream Domain Image Classification
P. Hill
D. R. Bull
27
4
0
13 Oct 2021
LightSeq2: Accelerated Training for Transformer-based Models on GPUs
Xiaohui Wang
Yang Wei
Ying Xiong
Guyue Huang
Xian Qian
Yufei Ding
Mingxuan Wang
Lei Li
VLM
8
29
0
12 Oct 2021
Investigating Transfer Learning Capabilities of Vision Transformers and CNNs by Fine-Tuning a Single Trainable Block
Durvesh Malpure
Onkar Litake
Rajesh S. Ingle
ViT
19
5
0
11 Oct 2021
Sparse MoEs meet Efficient Ensembles
J. Allingham
F. Wenzel
Zelda E. Mariet
Basil Mustafa
J. Puigcerver
...
Balaji Lakshminarayanan
Jasper Snoek
Dustin Tran
Carlos Riquelme Ruiz
Rodolphe Jenatton
MoE
46
21
0
07 Oct 2021
Exploring the Limits of Large Scale Pre-training
Samira Abnar
Mostafa Dehghani
Behnam Neyshabur
Hanie Sedghi
AI4CE
60
114
0
05 Oct 2021
ResNet strikes back: An improved training procedure in timm
Ross Wightman
Hugo Touvron
Hervé Jégou
AI4TS
212
487
0
01 Oct 2021
Localizing Objects with Self-Supervised Transformers and no Labels
Oriane Siméoni
Gilles Puy
Huy V. Vo
Simon Roburin
Spyros Gidaris
Andrei Bursuc
P. Pérez
Renaud Marlet
Jean Ponce
ViT
180
196
0
29 Sep 2021
Digital Signal Processing Using Deep Neural Networks
Brian Shevitski
Y. Watkins
Nicole Man
Michael Girard
AI4CE
31
4
0
21 Sep 2021
Compute and Energy Consumption Trends in Deep Learning Inference
Radosvet Desislavov
Fernando Martínez-Plumed
José Hernández-Orallo
35
113
0
12 Sep 2021
Robust fine-tuning of zero-shot models
Mitchell Wortsman
Gabriel Ilharco
Jong Wook Kim
Mike Li
Simon Kornblith
...
Raphael Gontijo-Lopes
Hannaneh Hajishirzi
Ali Farhadi
Hongseok Namkoong
Ludwig Schmidt
VLM
64
689
0
04 Sep 2021
Towards Efficient and Data Agnostic Image Classification Training Pipeline for Embedded Systems
K. Prokofiev
V. Sovrasov
3DH
19
2
0
16 Aug 2021
Pruning vs XNOR-Net: A Comprehensive Study of Deep Learning for Audio Classification on Edge-devices
Md Mohaimenuzzaman
Christoph Bergmeir
B. Meyer
14
21
0
13 Aug 2021
Go Wider Instead of Deeper
Fuzhao Xue
Ziji Shi
Futao Wei
Yuxuan Lou
Yong Liu
Yang You
ViT
MoE
25
80
0
25 Jul 2021
A Systematic Survey of Text Worlds as Embodied Natural Language Environments
Peter Alexander Jansen
LM&Ro
23
21
0
08 Jul 2021
Multimodal Few-Shot Learning with Frozen Language Models
Maria Tsimpoukelli
Jacob Menick
Serkan Cabi
S. M. Ali Eslami
Oriol Vinyals
Felix Hill
MLLM
55
749
0
25 Jun 2021
VOLO: Vision Outlooker for Visual Recognition
Li-xin Yuan
Qibin Hou
Zihang Jiang
Jiashi Feng
Shuicheng Yan
ViT
52
313
0
24 Jun 2021
TokenLearner: What Can 8 Learned Tokens Do for Images and Videos?
Michael S. Ryoo
A. Piergiovanni
Anurag Arnab
Mostafa Dehghani
A. Angelova
ViT
37
127
0
21 Jun 2021
BEiT: BERT Pre-Training of Image Transformers
Hangbo Bao
Li Dong
Songhao Piao
Furu Wei
ViT
66
2,749
0
15 Jun 2021
Scaling Vision with Sparse Mixture of Experts
C. Riquelme
J. Puigcerver
Basil Mustafa
Maxim Neumann
Rodolphe Jenatton
André Susano Pinto
Daniel Keysers
N. Houlsby
MoE
17
575
0
10 Jun 2021
CoAtNet: Marrying Convolution and Attention for All Data Sizes
Zihang Dai
Hanxiao Liu
Quoc V. Le
Mingxing Tan
ViT
49
1,167
0
09 Jun 2021
Self-Supervision is All You Need for Solving Rubik's Cube
Kyo Takano
13
1
0
06 Jun 2021
Effect of Pre-Training Scale on Intra- and Inter-Domain Full and Few-Shot Transfer Learning for Natural and Medical X-Ray Chest Images
Mehdi Cherti
J. Jitsev
LM&MA
22
23
0
31 May 2021
Adversarial Robustness against Multiple and Single $l_p$-Threat Models via Quick Fine-Tuning of Robust Classifiers
Francesco Croce
Matthias Hein
OOD
AAML
28
18
0
26 May 2021
MLP-Mixer: An all-MLP Architecture for Vision
Ilya O. Tolstikhin
N. Houlsby
Alexander Kolesnikov
Lucas Beyer
Xiaohua Zhai
...
Andreas Steiner
Daniel Keysers
Jakob Uszkoreit
Mario Lucic
Alexey Dosovitskiy
274
2,603
0
04 May 2021
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
347
5,785
0
29 Apr 2021
Visformer: The Vision-friendly Transformer
Zhengsu Chen
Lingxi Xie
Jianwei Niu
Xuefeng Liu
Longhui Wei
Qi Tian
ViT
120
209
0
26 Apr 2021
Easy and Efficient Transformer : Scalable Inference Solution For large NLP model
GongZheng Li
Yadong Xi
Jingzhen Ding
Duan Wang
Bai Liu
Changjie Fan
Xiaoxi Mao
Zeng Zhao
6
9
0
26 Apr 2021
The Shape of Learning Curves: a Review
T. Viering
Marco Loog
18
122
0
19 Mar 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
304
3,623
0
24 Feb 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
322
3,708
0
11 Feb 2021
Bottleneck Transformers for Visual Recognition
A. Srinivas
Nayeon Lee
Niki Parmar
Jonathon Shlens
Pieter Abbeel
Ashish Vaswani
SLR
290
979
0
27 Jan 2021
Transformers in Vision: A Survey
Salman Khan
Muzammal Naseer
Munawar Hayat
Syed Waqas Zamir
Fahad Shahbaz Khan
M. Shah
ViT
227
2,430
0
04 Jan 2021
A Survey on Visual Transformer
Kai Han
Yunhe Wang
Hanting Chen
Xinghao Chen
Jianyuan Guo
...
Chunjing Xu
Yixing Xu
Zhaohui Yang
Yiman Zhang
Dacheng Tao
ViT
18
2,130
0
23 Dec 2020
Why Do Better Loss Functions Lead to Less Transferable Features?
Simon Kornblith
Ting-Li Chen
Honglak Lee
Mohammad Norouzi
FaML
30
90
0
30 Oct 2020
Meta Pseudo Labels
Hieu H. Pham
Zihang Dai
Qizhe Xie
Minh-Thang Luong
Quoc V. Le
VLM
262
656
0
23 Mar 2020