ResearchTrend.AI
Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet

28 January 2021
Li-xin Yuan
Yunpeng Chen
Tao Wang
Weihao Yu
Yujun Shi
Zihang Jiang
Francis E. H. Tay
Jiashi Feng
Shuicheng Yan
    ViT

Papers citing "Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet"

50 / 408 papers shown
CAIT: Triple-Win Compression towards High Accuracy, Fast Inference, and Favorable Transferability For ViTs
Ao Wang
Hui Chen
Zijia Lin
Sicheng Zhao
J. Han
Guiguang Ding
ViT
34
6
0
27 Sep 2023
SCT: A Simple Baseline for Parameter-Efficient Fine-Tuning via Salient Channels
Henry Hengyuan Zhao
Pichao Wang
Yuyang Zhao
Hao Luo
F. Wang
Mike Zheng Shou
ViT
37
14
0
15 Sep 2023
Interpretability-Aware Vision Transformer
Yao Qiang
Chengyin Li
Prashant Khanduri
D. Zhu
ViT
82
7
0
14 Sep 2023
SwinFace: A Multi-task Transformer for Face Recognition, Expression Recognition, Age Estimation and Attribute Estimation
Lixiong Qin
Mei Wang
Chao Deng
K. Wang
Xiangshan Chen
Jiani Hu
Weihong Deng
CVBM
ViT
37
38
0
22 Aug 2023
MGMAE: Motion Guided Masking for Video Masked Autoencoding
Bingkun Huang
Zhiyu Zhao
Guozhen Zhang
Yu Qiao
Limin Wang
39
30
0
21 Aug 2023
PVG: Progressive Vision Graph for Vision Recognition
Jiafu Wu
Jian Li
Jiangning Zhang
Boshen Zhang
M. Chi
Yabiao Wang
Chengjie Wang
ViT
25
12
0
01 Aug 2023
A survey on deep learning in medical image registration: new technologies, uncertainty, evaluation metrics, and beyond
Junyu Chen
Yihao Liu
Shuwen Wei
Zhangxing Bian
Shalini Subramanian
A. Carass
Jerry L. Prince
Yong Du
OOD
45
36
0
28 Jul 2023
Set-level Guidance Attack: Boosting Adversarial Transferability of Vision-Language Pre-training Models
Dong Lu
Zhiqiang Wang
Teng Wang
Weili Guan
Hongchang Gao
Feng Zheng
AAML
53
65
0
26 Jul 2023
PatchCT: Aligning Patch Set and Label Set with Conditional Transport for Multi-Label Image Classification
Miaoge Li
Dongsheng Wang
Xinyang Liu
Zequn Zeng
Ruiying Lu
Bo Chen
Mingyuan Zhou
VLM
OT
25
15
0
18 Jul 2023
Spike-driven Transformer
Man Yao
Jiakui Hu
Zhaokun Zhou
Liuliang Yuan
Yonghong Tian
Boxing Xu
Guoqi Li
34
114
0
04 Jul 2023
Efficient Deep Spiking Multi-Layer Perceptrons with Multiplication-Free Inference
Boyan Li
Luziwei Leng
Shuaijie Shen
Kaixuan Zhang
Jianguo Zhang
Jianxing Liao
Ran Cheng
31
7
0
21 Jun 2023
Auto-Spikformer: Spikformer Architecture Search
Kaiwei Che
Zhaokun Zhou
Zhengyu Ma
Wei Fang
Yanqing Chen
Shuaijie Shen
Liuliang Yuan
Yonghong Tian
29
4
0
01 Jun 2023
Lightweight Vision Transformer with Bidirectional Interaction
Qihang Fan
Huaibo Huang
Xiaoqiang Zhou
Ran He
ViT
50
28
0
01 Jun 2023
Vision Transformers for Mobile Applications: A Short Survey
Nahid Alam
Steven Kolawole
S. Sethi
Nishant Bansali
Karina Nguyen
ViT
31
3
0
30 May 2023
Zero-TPrune: Zero-Shot Token Pruning through Leveraging of the Attention Graph in Pre-Trained Transformers
Hongjie Wang
Bhishma Dedhia
N. Jha
ViT
VLM
44
26
0
27 May 2023
How Deep Learning Sees the World: A Survey on Adversarial Attacks & Defenses
Joana Cabral Costa
Tiago Roxo
Hugo Manuel Proença
Pedro R. M. Inácio
AAML
40
50
0
18 May 2023
Enhancing the Performance of Transformer-based Spiking Neural Networks by SNN-optimized Downsampling with Precise Gradient Backpropagation
Chenlin Zhou
Han Zhang
Zhaokun Zhou
Liutao Yu
Zhengyu Ma
Huihui Zhou
Xiaopeng Fan
Yonghong Tian
26
9
0
10 May 2023
Breaking Through the Haze: An Advanced Non-Homogeneous Dehazing Method based on Fast Fourier Convolution and ConvNeXt
Han Zhou
Weida Dong
Yangyi Liu
Jun Chen
43
18
0
08 May 2023
MTLSegFormer: Multi-task Learning with Transformers for Semantic Segmentation in Precision Agriculture
D. Gonçalves
J. M. Junior
Pedro Zamboni
H. Pistori
Jonathan Li
Keiller Nogueira
W. Gonçalves
40
5
0
04 May 2023
FR-Net: A Light-weight FFT Residual Net For Gaze Estimation
Tao Xu
Borimandafu Wu
Ruilong Fan
Yun Zhou
Di Huang
32
2
0
04 May 2023
Rank Flow Embedding for Unsupervised and Semi-Supervised Manifold Learning
L. P. Valem
Daniel Carlos Guimarães Pedronette
Longin Jan Latecki
21
5
0
24 Apr 2023
TransFlow: Transformer as Flow Learner
Yawen Lu
Qifan Wang
Siqi Ma
Tong Geng
Victor Y. Chen
Huaijin Chen
Dongfang Liu
ViT
35
45
0
23 Apr 2023
Feature-compatible Progressive Learning for Video Copy Detection
Wenhao Wang
Yifan Sun
Yi Yang
22
3
0
20 Apr 2023
Permutation Equivariance of Transformers and Its Applications
Hengyuan Xu
Liyao Xiang
Hang Ye
Dixi Yao
Pengzhi Chu
Baochun Li
19
13
0
16 Apr 2023
Distilling Token-Pruned Pose Transformer for 2D Human Pose Estimation
Feixiang Ren
ViT
19
2
0
12 Apr 2023
ConvFormer: Parameter Reduction in Transformer Models for 3D Human Pose Estimation by Leveraging Dynamic Multi-Headed Convolutional Attention
Alec Diaz-Arias
Dmitriy Shin
ViT
18
10
0
04 Apr 2023
DIR-AS: Decoupling Individual Identification and Temporal Reasoning for Action Segmentation
Peiyao Wang
Haibin Ling
15
2
0
04 Apr 2023
SVT: Supertoken Video Transformer for Efficient Video Understanding
Chen-Ming Pan
Rui Hou
Hanchao Yu
Qifan Wang
Senem Velipasalar
Madian Khabsa
ViT
26
0
0
01 Apr 2023
SparseViT: Revisiting Activation Sparsity for Efficient High-Resolution Vision Transformer
Xuanyao Chen
Zhijian Liu
Haotian Tang
Li Yi
Hang Zhao
Song Han
ViT
29
46
0
30 Mar 2023
InceptionNeXt: When Inception Meets ConvNeXt
Weihao Yu
Pan Zhou
Shuicheng Yan
Xinchao Wang
48
119
0
29 Mar 2023
Vision Transformer with Quadrangle Attention
Qiming Zhang
Jing Zhang
Yufei Xu
Dacheng Tao
ViT
24
38
0
27 Mar 2023
Global-to-Local Modeling for Video-based 3D Human Pose and Shape Estimation
Xi Shen
Zongxin Yang
Xiaohan Wang
Jianxin Ma
Chang Zhou
Yezhou Yang
ViT
3DH
26
34
0
26 Mar 2023
One-to-Few Label Assignment for End-to-End Dense Detection
Shuai Li
Minghan Li
Ruihuang Li
Chenhang He
Lei Zhang
33
19
0
21 Mar 2023
Towards Diverse Binary Segmentation via A Simple yet General Gated Network
Xiaoqi Zhao
Youwei Pang
Lihe Zhang
Huchuan Lu
Lei Zhang
28
14
0
18 Mar 2023
Self-attention in Vision Transformers Performs Perceptual Grouping, Not Attention
Paria Mehrani
John K. Tsotsos
25
24
0
02 Mar 2023
A Comprehensive Study on Robustness of Image Classification Models: Benchmarking and Rethinking
Chang-Shu Liu
Yinpeng Dong
Wenzhao Xiang
X. Yang
Hang Su
Junyi Zhu
YueFeng Chen
Yuan He
H. Xue
Shibao Zheng
OOD
VLM
AAML
33
72
0
28 Feb 2023
Device Tuning for Multi-Task Large Model
Penghao Jiang
Xuanchen Hou
Y. Zhou
26
0
0
21 Feb 2023
LIT-Former: Linking In-plane and Through-plane Transformers for Simultaneous CT Image Denoising and Deblurring
Zhihao Chen
Chuang Niu
Qi Gao
Ge Wang
Hongming Shan
MedIm
ViT
3DV
36
20
0
21 Feb 2023
Soft Error Reliability Analysis of Vision Transformers
Xing-xiong Xue
Cheng Liu
Ying Wang
Bing Yang
Tao Luo
Lefei Zhang
Huawei Li
Xiaowei Li
39
14
0
21 Feb 2023
MedViT: A Robust Vision Transformer for Generalized Medical Image Classification
Omid Nejati Manzari
Hamid Ahmadabadi
Hossein Kashiani
S. B. Shokouhi
Ahmad Ayatollahi
ViT
MedIm
34
177
0
19 Feb 2023
Hyneter: Hybrid Network Transformer for Object Detection
Dong Chen
Duoqian Miao
Xuepeng Zhao
ViT
31
3
0
18 Feb 2023
Transformadores: Fundamentos teoricos y Aplicaciones
J. D. L. Torre
78
0
0
18 Feb 2023
Transformer-based Generative Adversarial Networks in Computer Vision: A Comprehensive Survey
S. Dubey
Satish Kumar Singh
ViT
44
33
0
17 Feb 2023
Efficiency 360: Efficient Vision Transformers
Badri N. Patro
Vijay Srinivas Agneeswaran
26
6
0
16 Feb 2023
Hierarchical Cross-modal Transformer for RGB-D Salient Object Detection
Hao Chen
Feihong Shen
ViT
36
0
0
16 Feb 2023
Robust Representation Learning with Self-Distillation for Domain Generalization
Ankur Singh
Senthilnath Jayavelu
ViT
OOD
18
2
0
14 Feb 2023
A Systematic Evaluation and Benchmark for Embedding-Aware Generative Models: Features, Models, and Any-shot Scenarios
Liangjun Feng
Jiancheng Zhao
Chunhui Zhao
VLM
32
0
0
08 Feb 2023
AIM: Adapting Image Models for Efficient Video Action Recognition
Taojiannan Yang
Yi Zhu
Yusheng Xie
Aston Zhang
Chong Chen
Mu Li
ViT
58
144
0
06 Feb 2023
Learning a Fourier Transform for Linear Relative Positional Encodings in Transformers
K. Choromanski
Shanda Li
Valerii Likhosherstov
Kumar Avinava Dubey
Shengjie Luo
Di He
Yiming Yang
Tamás Sarlós
Thomas Weingarten
Adrian Weller
37
8
0
03 Feb 2023
Inference Time Evidences of Adversarial Attacks for Forensic on Transformers
Hugo Lemarchant
Liang Li
Yiming Qian
Yuta Nakashima
Hajime Nagahara
ViT
AAML
43
0
0
31 Jan 2023