Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2012.12877
Cited By
Training data-efficient image transformers & distillation through attention
23 December 2020
Hugo Touvron
Matthieu Cord
Matthijs Douze
Francisco Massa
Alexandre Sablayrolles
Hervé Jégou
ViT
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Training data-efficient image transformers & distillation through attention"
50 / 1,164 papers shown
Title
Feature Fusion Vision Transformer for Fine-Grained Visual Categorization
Jun Wang
Xiaohan Yu
Yongsheng Gao
ViT
35
105
0
06 Jul 2021
Learning Efficient Vision Transformers via Fine-Grained Manifold Distillation
Zhiwei Hao
Jianyuan Guo
Ding Jia
Kai Han
Yehui Tang
Chao Zhang
Dacheng Tao
Yunhe Wang
ViT
33
68
0
03 Jul 2021
AutoFormer: Searching Transformers for Visual Recognition
Minghao Chen
Houwen Peng
Jianlong Fu
Haibin Ling
ViT
36
259
0
01 Jul 2021
Global Filter Networks for Image Classification
Yongming Rao
Wenliang Zhao
Zheng Zhu
Jiwen Lu
Jie Zhou
ViT
22
450
0
01 Jul 2021
Focal Self-attention for Local-Global Interactions in Vision Transformers
Jianwei Yang
Chunyuan Li
Pengchuan Zhang
Xiyang Dai
Bin Xiao
Lu Yuan
Jianfeng Gao
ViT
42
428
0
01 Jul 2021
Improving the Efficiency of Transformers for Resource-Constrained Devices
Hamid Tabani
Ajay Balasubramaniam
Shabbir Marzban
Elahe Arani
Bahram Zonooz
33
20
0
30 Jun 2021
Rethinking Token-Mixing MLP for MLP-based Vision Backbone
Tan Yu
Xu Li
Yunfeng Cai
Mingming Sun
Ping Li
45
26
0
28 Jun 2021
Deep Ensembling with No Overhead for either Training or Testing: The All-Round Blessings of Dynamic Sparsity
Shiwei Liu
Tianlong Chen
Zahra Atashgahi
Xiaohan Chen
Ghada Sokar
Elena Mocanu
Mykola Pechenizkiy
Zhangyang Wang
D. Mocanu
OOD
28
49
0
28 Jun 2021
Post-Training Quantization for Vision Transformer
Zhenhua Liu
Yunhe Wang
Kai Han
Siwei Ma
Wen Gao
ViT
MQ
41
325
0
27 Jun 2021
PVT v2: Improved Baselines with Pyramid Vision Transformer
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
AI4TS
29
1,607
0
25 Jun 2021
Probing Inter-modality: Visual Parsing with Self-Attention for Vision-Language Pre-training
Hongwei Xue
Yupan Huang
Bei Liu
Houwen Peng
Jianlong Fu
Houqiang Li
Jiebo Luo
22
88
0
25 Jun 2021
DnS: Distill-and-Select for Efficient and Accurate Video Indexing and Retrieval
Giorgos Kordopatis-Zilos
Christos Tzelepis
Symeon Papadopoulos
I. Kompatsiaris
Ioannis Patras
27
33
0
24 Jun 2021
Exploring Corruption Robustness: Inductive Biases in Vision Transformers and MLP-Mixers
Katelyn Morrison
B. Gilby
Colton Lipchak
Adam Mattioli
Adriana Kovashka
ViT
28
17
0
24 Jun 2021
IA-RED
2
^2
2
: Interpretability-Aware Redundancy Reduction for Vision Transformers
Bowen Pan
Rameswar Panda
Yifan Jiang
Zhangyang Wang
Rogerio Feris
A. Oliva
VLM
ViT
39
153
0
23 Jun 2021
Co-advise: Cross Inductive Bias Distillation
Sucheng Ren
Zhengqi Gao
Tianyu Hua
Zihui Xue
Yonglong Tian
Shengfeng He
Hang Zhao
44
53
0
23 Jun 2021
Vision Permutator: A Permutable MLP-Like Architecture for Visual Recognition
Qibin Hou
Zihang Jiang
Li-xin Yuan
Mingg-Ming Cheng
Shuicheng Yan
Jiashi Feng
ViT
MLLM
24
205
0
23 Jun 2021
P2T: Pyramid Pooling Transformer for Scene Understanding
Yu-Huan Wu
Yun-Hai Liu
Xin Zhan
Mingg-Ming Cheng
ViT
29
219
0
22 Jun 2021
Towards Biologically Plausible Convolutional Networks
Roman Pogodin
Yash Mehta
Timothy Lillicrap
P. Latham
26
22
0
22 Jun 2021
Structured Sparse R-CNN for Direct Scene Graph Generation
Yao Teng
Limin Wang
3DPC
GNN
26
53
0
21 Jun 2021
How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers
Andreas Steiner
Alexander Kolesnikov
Xiaohua Zhai
Ross Wightman
Jakob Uszkoreit
Lucas Beyer
ViT
39
614
0
18 Jun 2021
Efficient Self-supervised Vision Transformers for Representation Learning
Chunyuan Li
Jianwei Yang
Pengchuan Zhang
Mei Gao
Bin Xiao
Xiyang Dai
Lu Yuan
Jianfeng Gao
ViT
32
209
0
17 Jun 2021
XCiT: Cross-Covariance Image Transformers
Alaaeldin El-Nouby
Hugo Touvron
Mathilde Caron
Piotr Bojanowski
Matthijs Douze
...
Ivan Laptev
Natalia Neverova
Gabriel Synnaeve
Jakob Verbeek
Hervé Jégou
ViT
36
497
0
17 Jun 2021
Shuffle Transformer with Feature Alignment for Video Face Parsing
Rui Zhang
Yang Han
Zilong Huang
Pei Cheng
Guozhong Luo
Gang Yu
Bin-Bin Fu
CVBM
ViT
30
1
0
16 Jun 2021
Physion: Evaluating Physical Prediction from Vision in Humans and Machines
Daniel M. Bear
E. Wang
Damian Mrowca
Felix Binder
Hsiau-Yu Fish Tung
...
Li Fei-Fei
Nancy Kanwisher
J. Tenenbaum
Daniel L. K. Yamins
Judith E. Fan
OOD
58
86
0
15 Jun 2021
BEiT: BERT Pre-Training of Image Transformers
Hangbo Bao
Li Dong
Songhao Piao
Furu Wei
ViT
54
2,747
0
15 Jun 2021
Improved Transformer for High-Resolution GANs
Long Zhao
Zizhao Zhang
Ting Chen
Dimitris N. Metaxas
Han Zhang
ViT
29
95
0
14 Jun 2021
Delving Deep into the Generalization of Vision Transformers under Distribution Shifts
Chongzhi Zhang
Mingyuan Zhang
Shanghang Zhang
Daisheng Jin
Qiang-feng Zhou
Zhongang Cai
Haiyu Zhao
Xianglong Liu
Ziwei Liu
18
102
0
14 Jun 2021
Survey: Image Mixing and Deleting for Data Augmentation
Humza Naveed
Saeed Anwar
Munawar Hayat
Kashif Javed
Ajmal Mian
35
78
0
13 Jun 2021
Space-time Mixing Attention for Video Transformer
Adrian Bulat
Juan-Manuel Perez-Rua
Swathikiran Sudhakaran
Brais Martínez
Georgios Tzimiropoulos
ViT
27
124
0
10 Jun 2021
Scaling Vision with Sparse Mixture of Experts
C. Riquelme
J. Puigcerver
Basil Mustafa
Maxim Neumann
Rodolphe Jenatton
André Susano Pinto
Daniel Keysers
N. Houlsby
MoE
12
575
0
10 Jun 2021
CAT: Cross Attention in Vision Transformer
Hezheng Lin
Xingyi Cheng
Xiangyu Wu
Fan Yang
Dong Shen
Zhongyuan Wang
Qing Song
Wei Yuan
ViT
27
149
0
10 Jun 2021
MST: Masked Self-Supervised Transformer for Visual Representation
Zhaowen Li
Zhiyang Chen
Fan Yang
Wei Li
Yousong Zhu
...
Rui Deng
Liwei Wu
Rui Zhao
Ming Tang
Jinqiao Wang
ViT
37
162
0
10 Jun 2021
Keeping Your Eye on the Ball: Trajectory Attention in Video Transformers
Mandela Patrick
Dylan Campbell
Yuki M. Asano
Ishan Misra
Ishan Misra Florian Metze
Christoph Feichtenhofer
Andrea Vedaldi
João F. Henriques
8
274
0
09 Jun 2021
Towards Training Stronger Video Vision Transformers for EPIC-KITCHENS-100 Action Recognition
Ziyuan Huang
Zhiwu Qing
Xiang Wang
Yutong Feng
Shiwei Zhang
Jianwen Jiang
Zhurong Xia
Mingqian Tang
Nong Sang
M. Ang
ViT
21
11
0
09 Jun 2021
CoAtNet: Marrying Convolution and Attention for All Data Sizes
Zihang Dai
Hanxiao Liu
Quoc V. Le
Mingxing Tan
ViT
49
1,167
0
09 Jun 2021
MVT: Mask Vision Transformer for Facial Expression Recognition in the wild
Hanting Li
Ming-Fa Sui
Feng Zhao
Zhengjun Zha
Feng Wu
ViT
34
75
0
08 Jun 2021
Fully Transformer Networks for Semantic Image Segmentation
Sitong Wu
Tianyi Wu
Fangjian Lin
Sheng Tian
Guodong Guo
ViT
34
39
0
08 Jun 2021
Person Re-Identification with a Locally Aware Transformer
Charu Sharma
S. R. Kapil
David Chapman
ViT
42
45
0
07 Jun 2021
Self-supervised Depth Estimation Leveraging Global Perception and Geometric Smoothness Using On-board Videos
Shaocheng Jia
Xin Pei
W. Yao
S. Wong
3DPC
MDE
38
19
0
07 Jun 2021
ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias
Yufei Xu
Qiming Zhang
Jing Zhang
Dacheng Tao
ViT
53
329
0
07 Jun 2021
Rethinking Training from Scratch for Object Detection
Yang Li
Hong Zhang
Yu Zhang
VLM
OnRL
ObjD
25
5
0
06 Jun 2021
CATs: Cost Aggregation Transformers for Visual Correspondence
Seokju Cho
Sunghwan Hong
Sangryul Jeon
Yunsung Lee
K. Sohn
Seungryong Kim
ViT
26
86
0
04 Jun 2021
Few-Shot Segmentation via Cycle-Consistent Transformer
Gengwei Zhang
Guoliang Kang
Yi Yang
Yunchao Wei
ViT
19
177
0
04 Jun 2021
Scalable Transformers for Neural Machine Translation
Peng Gao
Shijie Geng
Yu Qiao
Xiaogang Wang
Jifeng Dai
Hongsheng Li
31
13
0
04 Jun 2021
Analogous to Evolutionary Algorithm: Designing a Unified Sequence Model
Jiangning Zhang
Chao Xu
Jian Li
Wenzhou Chen
Yabiao Wang
Ying Tai
Shuo Chen
Chengjie Wang
Feiyue Huang
Yong Liu
29
22
0
31 May 2021
Dual-stream Network for Visual Recognition
Mingyuan Mao
Renrui Zhang
Honghui Zheng
Peng Gao
Teli Ma
Yan Peng
Errui Ding
Baochang Zhang
Shumin Han
ViT
25
63
0
31 May 2021
FoveaTer: Foveated Transformer for Image Classification
Aditya Jonnalagadda
W. Wang
B. S. Manjunath
M. Eckstein
ViT
17
23
0
29 May 2021
What Is Considered Complete for Visual Recognition?
Lingxi Xie
Xiaopeng Zhang
Longhui Wei
Jianlong Chang
Qi Tian
VLM
23
4
0
28 May 2021
ResT: An Efficient Transformer for Visual Recognition
Qing-Long Zhang
Yubin Yang
ViT
29
229
0
28 May 2021
KVT: k-NN Attention for Boosting Vision Transformers
Pichao Wang
Xue Wang
F. Wang
Ming Lin
Shuning Chang
Hao Li
R. L. Jin
ViT
48
105
0
28 May 2021
Previous
1
2
3
...
21
22
23
24
Next