ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

Training data-efficient image transformers & distillation through attention
arXiv:2012.12877
23 December 2020
Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, Hervé Jégou
[ViT]
Papers citing "Training data-efficient image transformers & distillation through attention"

50 / 1,234 papers shown
• Temporal Perceiver: A General Architecture for Arbitrary Boundary Detection — Jing Tan, Yuhong Wang, Gangshan Wu, Limin Wang (01 Mar 2022)
• DropIT: Dropping Intermediate Tensors for Memory-Efficient DNN Training — Joya Chen, Kai Xu, Yuhui Wang, Yifei Cheng, Angela Yao (28 Feb 2022)
• CTformer: Convolution-free Token2Token Dilated Vision Transformer for Low-dose CT Denoising — Dayang Wang, Fenglei Fan, Zhan Wu, R. Liu, Fei-Yue Wang, Hengyong Yu (28 Feb 2022) [ViT, MedIm]
• Learn From the Past: Experience Ensemble Knowledge Distillation — Chaofei Wang, Shaowei Zhang, S. Song, Gao Huang (25 Feb 2022)
• Delving Deep into One-Shot Skeleton-based Action Recognition with Diverse Occlusions — Kunyu Peng, Alina Roitberg, Kailun Yang, Jiaming Zhang, Rainer Stiefelhagen (23 Feb 2022) [ViT]
• GroupViT: Semantic Segmentation Emerges from Text Supervision — Jiarui Xu, Shalini De Mello, Sifei Liu, Wonmin Byeon, Thomas Breuel, Jan Kautz, Xinyu Wang (22 Feb 2022) [ViT, VLM]
• CaMEL: Mean Teacher Learning for Image Captioning — Manuele Barraco, Matteo Stefanini, Marcella Cornia, S. Cascianelli, Lorenzo Baraldi, Rita Cucchiara (21 Feb 2022) [ViT, VLM]
• Visual Attention Network — Meng-Hao Guo, Chengrou Lu, Zheng-Ning Liu, Ming-Ming Cheng, Shiyong Hu (20 Feb 2022) [ViT, VLM]
• VLP: A Survey on Vision-Language Pre-training — Feilong Chen, Duzhen Zhang, Minglun Han, Xiuyi Chen, Jing Shi, Shuang Xu, Bo Xu (18 Feb 2022) [VLM]
• Graph Masked Autoencoders with Transformers — Sixiao Zhang, Hongxu Chen, Haoran Yang, Xiangguo Sun, Philip S. Yu, Guandong Xu (17 Feb 2022)
• Vision Models Are More Robust And Fair When Pretrained On Uncurated Images Without Supervision — Priya Goyal, Quentin Duval, Isaac Seessel, Mathilde Caron, Ishan Misra, Levent Sagun, Armand Joulin, Piotr Bojanowski (16 Feb 2022) [VLM, SSL]
• Meta Knowledge Distillation — Jihao Liu, Boxiao Liu, Hongsheng Li, Yu Liu (16 Feb 2022)
• ActionFormer: Localizing Moments of Actions with Transformers — Chen-Da Liu-Zhang, Jianxin Wu, Yin Li (16 Feb 2022) [ViT]
• Rethinking Network Design and Local Geometry in Point Cloud: A Simple Residual MLP Framework — Xu Ma, Can Qin, Haoxuan You, Haoxi Ran, Y. Fu (15 Feb 2022) [3DPC]
• CATs++: Boosting Cost Aggregation with Convolutions and Transformers — Seokju Cho, Sunghwan Hong, Seung Wook Kim (14 Feb 2022) [ViT]
• How Do Vision Transformers Work? — Namuk Park, Songkuk Kim (14 Feb 2022) [ViT]
• Mixing and Shifting: Exploiting Global and Local Dependencies in Vision MLPs — Huangjie Zheng, Pengcheng He, Weizhu Chen, Mingyuan Zhou (14 Feb 2022)
• Flowformer: Linearizing Transformers with Conservation Flows — Haixu Wu, Jialong Wu, Jiehui Xu, Jianmin Wang, Mingsheng Long (13 Feb 2022)
• How to Understand Masked Autoencoders — Shuhao Cao, Peng-Tao Xu, David A. Clifton (08 Feb 2022)
• LwPosr: Lightweight Efficient Fine-Grained Head Pose Estimation — Naina Dhingra (07 Feb 2022)
• Transformers in Self-Supervised Monocular Depth Estimation with Unknown Camera Intrinsics — Arnav Varma, Hemang Chawla, Bahram Zonooz, Elahe Arani (07 Feb 2022) [ViT, MDE]
• Learning Features with Parameter-Free Layers — Dongyoon Han, Y. Yoo, Beomyoung Kim, Byeongho Heo (06 Feb 2022)
• A Note on "Assessing Generalization of SGD via Disagreement" — Andreas Kirsch, Y. Gal (03 Feb 2022) [FedML, UQCV]
• HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection — Ke Chen, Xingjian Du, Bilei Zhu, Zejun Ma, Taylor Berg-Kirkpatrick, Shlomo Dubnov (02 Feb 2022) [ViT]
• Query Efficient Decision Based Sparse Attacks Against Black-Box Deep Learning Models — Viet Vo, Ehsan Abbasnejad, D. Ranasinghe (31 Jan 2022) [AAML]
• Plug-In Inversion: Model-Agnostic Inversion for Vision with Data Augmentations — Amin Ghiasi, Hamid Kazemi, Steven Reich, Chen Zhu, Micah Goldblum, Tom Goldstein (31 Jan 2022)
• Aggregating Global Features into Local Vision Transformer — Krushi Patel, A. Bur, Fengju Li, Guanghui Wang (30 Jan 2022) [ViT]
• O-ViT: Orthogonal Vision Transformer — Yanhong Fei, Yingjie Liu, Xian Wei, Mingsong Chen (28 Jan 2022) [ViT]
• BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation — Junnan Li, Dongxu Li, Caiming Xiong, S. Hoi (28 Jan 2022) [MLLM, BDL, VLM, CLIP]
• DynaMixer: A Vision MLP Architecture with Dynamic Mixing — Ziyu Wang, Wenhao Jiang, Yiming Zhu, Li Yuan, Yibing Song, Wei Liu (28 Jan 2022)
• Vision Checklist: Towards Testable Error Analysis of Image Models to Help System Designers Interrogate Model Capabilities — Xin Du, Bénédicte Legastelois, B. Ganesh, A. Rajan, Hana Chockler, Vaishak Belle, Stuart Anderson, S. Ramamoorthy (27 Jan 2022) [AAML]
• Joint Liver and Hepatic Lesion Segmentation in MRI using a Hybrid CNN with Transformer Layers — Georg Hille, Shubham Agrawal, Pavan Tummala, C. Wybranski, M. Pech, A. Surov, S. Saalfeld (26 Jan 2022) [ViT, MedIm]
• One Student Knows All Experts Know: From Sparse to Dense — Fuzhao Xue, Xiaoxin He, Xiaozhe Ren, Yuxuan Lou, Yang You (26 Jan 2022) [MoMe, MoE]
• UniFormer: Unifying Convolution and Self-attention for Visual Recognition — Kunchang Li, Yali Wang, Junhao Zhang, Peng Gao, Guanglu Song, Yu Liu, Hongsheng Li, Yu Qiao (24 Jan 2022) [ViT]
• Improving Chest X-Ray Report Generation by Leveraging Warm Starting — Aaron Nicolson, Jason Dowling, Bevan Koopman (24 Jan 2022) [ViT, LM&MA, MedIm]
• VAQF: Fully Automatic Software-Hardware Co-Design Framework for Low-Bit Vision Transformer — Mengshu Sun, Haoyu Ma, Guoliang Kang, Yi Ding, Tianlong Chen, Xiaolong Ma, Zhangyang Wang, Yanzhi Wang (17 Jan 2022) [ViT]
• Video Transformers: A Survey — Javier Selva, A. S. Johansen, Sergio Escalera, Kamal Nasrollahi, T. Moeslund, Albert Clapés (16 Jan 2022) [ViT]
• UniFormer: Unified Transformer for Efficient Spatiotemporal Representation Learning — Kunchang Li, Yali Wang, Peng Gao, Guanglu Song, Yu Liu, Hongsheng Li, Yu Qiao (12 Jan 2022) [ViT]
• Multiview Transformers for Video Recognition — Shen Yan, Xuehan Xiong, Anurag Arnab, Zhichao Lu, Mi Zhang, Chen Sun, Cordelia Schmid (12 Jan 2022) [ViT]
• A ConvNet for the 2020s — Zhuang Liu, Hanzi Mao, Chaozheng Wu, Christoph Feichtenhofer, Trevor Darrell, Saining Xie (10 Jan 2022) [ViT]
• QuadTree Attention for Vision Transformers — Shitao Tang, Jiahui Zhang, Siyu Zhu, Ping Tan (08 Jan 2022) [ViT]
• Short Range Correlation Transformer for Occluded Person Re-Identification — Yunbin Zhao, Song-Chun Zhu, Dongsheng Wang, Zhiwei Liang (04 Jan 2022) [ViT]
• PyramidTNT: Improved Transformer-in-Transformer Baselines with Pyramid Architecture — Kai Han, Jianyuan Guo, Yehui Tang, Yunhe Wang (04 Jan 2022) [ViT]
• Multi-Dimensional Model Compression of Vision Transformer — Zejiang Hou, S. Kung (31 Dec 2021) [ViT]
• Background-aware Classification Activation Map for Weakly Supervised Object Localization — Lei Zhu, Qi She, Qian Chen, Xiangxi Meng, Mufeng Geng, ..., Bin Qiu, Yunfei You, Yibao Zhang, Qiushi Ren, Yanye Lu (29 Dec 2021) [WSOL]
• Pale Transformer: A General Vision Transformer Backbone with Pale-Shaped Attention — Sitong Wu, Tianyi Wu, Hao Hao Tan, G. Guo (28 Dec 2021) [ViT]
• Augmenting Convolutional networks with attention-based aggregation — Hugo Touvron, Matthieu Cord, Alaaeldin El-Nouby, Piotr Bojanowski, Armand Joulin, Gabriel Synnaeve, Hervé Jégou (27 Dec 2021) [ViT]
• MSHT: Multi-stage Hybrid Transformer for the ROSE Image Analysis of Pancreatic Cancer — Tianyi Zhang, Yunlu Feng, Yu Zhao, Guangda Fan, Aiming Yang, ..., Fan Song, Chenbin Ma, Yangyang Sun, Youdan Feng, Guanglei Zhang (27 Dec 2021) [ViT, MedIm]
• Vision Transformer for Small-Size Datasets — Seung Hoon Lee, Seunghyun Lee, B. Song (27 Dec 2021) [ViT]
• Raw Produce Quality Detection with Shifted Window Self-Attention — Oh Joon Kwon, Byungsoo Kim, Youngduck Choi (24 Dec 2021) [ViT]