Training data-efficient image transformers & distillation through attention (arXiv: 2012.12877)
23 December 2020
Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, Hervé Jégou
ViT
Papers citing "Training data-efficient image transformers & distillation through attention" (50 of 1,268 papers shown)
INDIGO: Intrinsic Multimodality for Domain Generalization
Puneet Mangla
Shivam Chandhok
Milan Aggarwal
V. Balasubramanian
Balaji Krishnamurthy
VLM
41
2
0
13 Jun 2022
NR-DFERNet: Noise-Robust Network for Dynamic Facial Expression Recognition
Hanting Li
Ming-Fa Sui
Zhaoqing Zhu
Feng Zhao
25
27
0
10 Jun 2022
Masked Autoencoders are Robust Data Augmentors
Haohang Xu
Shuangrui Ding
Xiaopeng Zhang
H. Xiong
35
27
0
10 Jun 2022
GateHUB: Gated History Unit with Background Suppression for Online Action Detection
Junwen Chen
Gaurav Mittal
Ye Yu
Yu Kong
Mei Chen
41
33
0
09 Jun 2022
Extreme Masking for Learning Instance and Distributed Visual Representations
Zhirong Wu
Zihang Lai
Xiao Sun
Stephen Lin
35
22
0
09 Jun 2022
SimVP: Simpler yet Better Video Prediction
Zhangyang Gao
Cheng Tan
Lirong Wu
Stan Z. Li
40
211
0
09 Jun 2022
Revealing Single Frame Bias for Video-and-Language Learning
Jie Lei
Tamara L. Berg
Joey Tianyi Zhou
24
111
0
07 Jun 2022
IL-MCAM: An interactive learning and multi-channel attention mechanism-based weakly supervised colorectal histopathology image classification approach
Hao Chen
Chen Li
Xirong Li
M. Rahaman
Weiming Hu
...
Wanli Liu
Changhao Sun
Hongzan Sun
Xinyu Huang
M. Grzegorzek
HAI
35
99
0
07 Jun 2022
Separable Self-attention for Mobile Vision Transformers
Sachin Mehta
Mohammad Rastegari
ViT
MQ
26
252
0
06 Jun 2022
Which models are innately best at uncertainty estimation?
Ido Galil
Mohammed Dabbah
Ran El-Yaniv
UQCV
34
5
0
05 Jun 2022
CVNets: High Performance Library for Computer Vision
Sachin Mehta
Farzad Abdolhosseini
Mohammad Rastegari
29
18
0
04 Jun 2022
EfficientFormer: Vision Transformers at MobileNet Speed
Yanyu Li
Geng Yuan
Yang Wen
Eric Hu
Georgios Evangelidis
Sergey Tulyakov
Yanzhi Wang
Jian Ren
ViT
23
347
0
02 Jun 2022
Optimizing Relevance Maps of Vision Transformers Improves Robustness
Hila Chefer
Idan Schwartz
Lior Wolf
ViT
35
37
0
02 Jun 2022
CVM-Cervix: A Hybrid Cervical Pap-Smear Image Classification Framework Using CNN, Visual Transformer and Multilayer Perceptron
Wanli Liu
Chen Li
N. Xu
Tao Jiang
M. Rahaman
...
Weiming Hu
Hao Chen
Changhao Sun
Yudong Yao
M. Grzegorzek
9
132
0
02 Jun 2022
Squeezeformer: An Efficient Transformer for Automatic Speech Recognition
Sehoon Kim
A. Gholami
Albert Eaton Shaw
Nicholas Lee
K. Mangalam
Jitendra Malik
Michael W. Mahoney
Kurt Keutzer
32
99
0
02 Jun 2022
XBound-Former: Toward Cross-scale Boundary Modeling in Transformers
Jiacheng Wang
Fei Chen
Yuxi Ma
Liansheng Wang
Zhaodong Fei
Jia Shuai
Xiangdong Tang
Qichao Zhou
Jing Qin
ViT
MedIm
27
63
0
02 Jun 2022
Vision GNN: An Image is Worth Graph of Nodes
Kai Han
Yunhe Wang
Jianyuan Guo
Yehui Tang
Enhua Wu
GNN
3DH
15
352
0
01 Jun 2022
Exact Feature Collisions in Neural Networks
Utku Ozbulak
Manvel Gasparyan
Shodhan Rao
W. D. Neve
Arnout Van Messem
AAML
24
1
0
31 May 2022
ViT-BEVSeg: A Hierarchical Transformer Network for Monocular Birds-Eye-View Segmentation
Pramit Dutta
Ganesh Sistu
S. Yogamani
E. López
J. McDonald
ViT
19
16
0
31 May 2022
Few-Shot Diffusion Models
Giorgio Giannone
Didrik Nielsen
Ole Winther
DiffM
183
49
0
30 May 2022
Self-Supervised Pre-training of Vision Transformers for Dense Prediction Tasks
Jaonary Rabarisoa
Valentin Belissen
Florian Chabot
Q. C. Pham
VLM
ViT
SSL
MDE
23
2
0
30 May 2022
GMML is All you Need
Sara Atito
Muhammad Awais
J. Kittler
ViT
VLM
46
18
0
30 May 2022
HiViT: Hierarchical Vision Transformer Meets Masked Image Modeling
Xiaosong Zhang
Yunjie Tian
Wei Huang
Qixiang Ye
Qi Dai
Lingxi Xie
Qi Tian
64
26
0
30 May 2022
SupMAE: Supervised Masked Autoencoders Are Efficient Vision Learners
Feng Liang
Yangguang Li
Diana Marculescu
SSL
TPM
ViT
51
22
0
28 May 2022
A Closer Look at Self-Supervised Lightweight Vision Transformers
Shaoru Wang
Jin Gao
Zeming Li
Jian Sun
Weiming Hu
ViT
67
41
0
28 May 2022
WaveMix: A Resource-efficient Neural Network for Image Analysis
Pranav Jeevan
Kavitha Viswanathan
Anandu A. S.
A. Sethi
20
20
0
28 May 2022
Multi-Task Learning with Multi-Query Transformer for Dense Prediction
Yangyang Xu
Xiangtai Li
Haobo Yuan
Yibo Yang
Lefei Zhang
ViT
28
45
0
28 May 2022
Object-wise Masked Autoencoders for Fast Pre-training
Jiantao Wu
Shentong Mo
ViT
OCL
25
15
0
28 May 2022
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
Tri Dao
Daniel Y. Fu
Stefano Ermon
Atri Rudra
Christopher Ré
VLM
78
2,024
0
27 May 2022
Spartan: Differentiable Sparsity via Regularized Transportation
Kai Sheng Tai
Taipeng Tian
Ser-Nam Lim
31
11
0
27 May 2022
Architecture-Agnostic Masked Image Modeling -- From ViT back to CNN
Siyuan Li
Di Wu
Fang Wu
Lei Shang
Stan Z. Li
34
48
0
27 May 2022
Sample-Efficient Optimisation with Probabilistic Transformer Surrogates
A. Maraval
Matthieu Zimmer
Antoine Grosnit
Rasul Tutunov
Jun Wang
H. Ammar
30
2
0
27 May 2022
Green Hierarchical Vision Transformer for Masked Image Modeling
Lang Huang
Shan You
Mingkai Zheng
Fei Wang
Chao Qian
T. Yamasaki
35
68
0
26 May 2022
Cross-Architecture Self-supervised Video Representation Learning
Sheng Guo
Zihua Xiong
Yujie Zhong
Limin Wang
Xiaobo Guo
Bing Han
Weilin Huang
SSL
AI4TS
76
24
0
26 May 2022
Inception Transformer
Chenyang Si
Weihao Yu
Pan Zhou
Yichen Zhou
Xinchao Wang
Shuicheng Yan
ViT
28
187
0
25 May 2022
Super Vision Transformer
Mingbao Lin
Yonghong Tian
Yuxin Zhang
Yunhang Shen
Rongrong Ji
Liujuan Cao
ViT
46
20
0
23 May 2022
Deep Digging into the Generalization of Self-Supervised Monocular Depth Estimation
Ji-Hoon Bae
Sungho Moon
Sunghoon Im
MDE
33
84
0
23 May 2022
Dynamic Query Selection for Fast Visual Perceiver
Corentin Dancette
Matthieu Cord
36
1
0
22 May 2022
Vision Transformers in 2022: An Update on Tiny ImageNet
Ethan Huynh
ViT
31
11
0
21 May 2022
Improvements to Self-Supervised Representation Learning for Masked Image Modeling
Jia-ju Mao
Xuesong Yin
Yuan Chang
Honggu Zhou
SSL
27
1
0
21 May 2022
Learning to Count Anything: Reference-less Class-agnostic Counting with Weak Supervision
Michael A. Hobley
V. Prisacariu
47
36
0
20 May 2022
Masked Image Modeling with Denoising Contrast
Kun Yi
Yixiao Ge
Xiaotong Li
Shusheng Yang
Dian Li
Jianping Wu
Ying Shan
Xiaohu Qie
VLM
30
51
0
19 May 2022
Integrally Migrating Pre-trained Transformer Encoder-decoders for Visual Object Detection
Feng Liu
Xiaosong Zhang
Zhiliang Peng
Zonghao Guo
Fang Wan
Xian-Wei Ji
Qixiang Ye
ObjD
43
20
0
19 May 2022
Cross-Enhancement Transformer for Action Segmentation
Jiahui Wang
Zhenyou Wang
Shanna Zhuang
Hui Wang
ViT
54
23
0
19 May 2022
MulT: An End-to-End Multitask Learning Transformer
Deblina Bhattacharjee
Tong Zhang
Sabine Süsstrunk
Mathieu Salzmann
ViT
39
62
0
17 May 2022
Deep Spectral Methods: A Surprisingly Strong Baseline for Unsupervised Semantic Segmentation and Localization
Luke Melas-Kyriazi
Christian Rupprecht
Iro Laina
Andrea Vedaldi
30
159
0
16 May 2022
Diffusion Models for Adversarial Purification
Weili Nie
Brandon Guo
Yujia Huang
Chaowei Xiao
Arash Vahdat
Anima Anandkumar
WIGM
218
418
0
16 May 2022
Discovering and Explaining the Representation Bottleneck of Graph Neural Networks from Multi-order Interactions
Fang Wu
Siyuan Li
Lirong Wu
Dragomir R. Radev
Stan Z. Li
27
2
0
15 May 2022
Dense residual Transformer for image denoising
Chao Yao
Shuo Jin
Meiqin Liu
Xiaojuan Ban
ViT
36
29
0
14 May 2022
Simple Open-Vocabulary Object Detection with Vision Transformers
Matthias Minderer
A. Gritsenko
Austin Stone
Maxim Neumann
Dirk Weissenborn
...
Zhuoran Shen
Tianlin Li
Xiaohua Zhai
Thomas Kipf
N. Houlsby
ObjD
CLIP
VLM
ViT
OCL
34
307
0
12 May 2022