2101.11986
Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet
28 January 2021
Li Yuan
Yunpeng Chen
Tao Wang
Weihao Yu
Yujun Shi
Zihang Jiang
Francis E. H. Tay
Jiashi Feng
Shuicheng Yan
ViT
Papers citing
"Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet"
50 / 395 papers shown
Title
DRRNet: Macro-Micro Feature Fusion and Dual Reverse Refinement for Camouflaged Object Detection
Jianlin Sun
Xiaolin Fang
Juwei Guan
Dongdong Gui
Teqi Wang
Tongxin Zhu
30
0
0
14 May 2025
A 2D Semantic-Aware Position Encoding for Vision Transformers
Xi Chen
Shiyang Zhou
Muqi Huang
Jiaxu Feng
Yun Xiong
...
Yuyao Zhang
Huishuai Bao
Sijia Peng
Chong Li
Feng Shi
ViT
31
0
0
14 May 2025
Lightweight RGB-D Salient Object Detection from a Speed-Accuracy Tradeoff Perspective
Songsong Duan
Xi Yang
Nannan Wang
Xinbo Gao
55
0
0
07 May 2025
Image Recognition with Online Lightweight Vision Transformer: A Survey
Zherui Zhang
Rongtao Xu
Jie Zhou
Changwei Wang
Xingtian Pei
...
Jiguang Zhang
Li Guo
Longxiang Gao
Wenyuan Xu
Shibiao Xu
ViT
148
0
0
06 May 2025
Vision Transformers in Precision Agriculture: A Comprehensive Survey
Saber Mehdipour
Seyed Abolghasem Mirroshandel
Seyed Amirhossein Tabatabaei
39
0
0
30 Apr 2025
Optimal Hyperspectral Undersampling Strategy for Satellite Imaging
Vita V. Vlasova
Vladimir G. Kuzmin
Maria S. Varetsa
Natalia A. Ibragimova
Oleg Y. Rogov
Elena V. Lyapuntsova
21
0
0
27 Apr 2025
TimeCapsule: Solving the Jigsaw Puzzle of Long-Term Time Series Forecasting with Compressed Predictive Representations
Yihang Lu
Yangyang Xu
Qitao Qing
Xianwei Meng
AI4TS
49
0
0
17 Apr 2025
Embedding Radiomics into Vision Transformers for Multimodal Medical Image Classification
Zhenyu Yang
Haiming Zhu
Rihui Zhang
Haipeng Zhang
Jianliang Wang
Chunhao Wang
Minbin Chen
F. Yin
MedIm
38
0
0
15 Apr 2025
GFT: Gradient Focal Transformer
Boris Kriuk
Simranjit Kaur Gill
Shoaib Aslam
Amir Fakhrutdinov
31
0
0
14 Apr 2025
HGFormer: Topology-Aware Vision Transformer with HyperGraph Learning
Hao Wang
Shuo Zhang
Biao Leng
ViT
82
0
0
03 Apr 2025
SpiLiFormer: Enhancing Spiking Transformers with Lateral Inhibition
Zeqi Zheng
Yanchen Huang
Yingchao Yu
Zizheng Zhu
Junfeng Tang
Zhaofei Yu
Yaochu Jin
39
0
0
20 Mar 2025
Semi-Supervised 360 Layout Estimation with Panoramic Collaborative Perturbations
Junsong Zhang
Chunyu Lin
Zhijie Shen
Lang Nie
K. Liao
Yao Zhao
35
0
0
03 Mar 2025
VRM: Knowledge Distillation via Virtual Relation Matching
W. Zhang
Fei Xie
Weidong Cai
Chao Ma
76
0
0
28 Feb 2025
Low-Rank Thinning
Annabelle Michael Carrell
Albert Gong
Abhishek Shetty
Raaz Dwivedi
Lester W. Mackey
61
0
0
17 Feb 2025
PolaFormer: Polarity-aware Linear Attention for Vision Transformers
Weikang Meng
Yadan Luo
Xin Li
D. Jiang
Zheng Zhang
159
0
0
25 Jan 2025
Parallel Sequence Modeling via Generalized Spatial Propagation Network
Hongjun Wang
Wonmin Byeon
Jiarui Xu
Liang Feng
Ka Chun Cheung
Xiaolong Wang
Kai Han
Jan Kautz
Sifei Liu
152
0
0
21 Jan 2025
VMamba: Visual State Space Model
Yue Liu
Yunjie Tian
Yuzhong Zhao
Hongtian Yu
Lingxi Xie
Yaowei Wang
Qixiang Ye
Jianbin Jiao
Yunfan Liu
Mamba
152
612
0
31 Dec 2024
CMAL: A Novel Cross-Modal Associative Learning Framework for Vision-Language Pre-Training
Zhiyuan Ma
Jianjun Li
Guohui Li
Kaiyan Huang
VLM
56
9
0
16 Oct 2024
MoH: Multi-Head Attention as Mixture-of-Head Attention
Peng Jin
Bo Zhu
Li Yuan
Shuicheng Yan
MoE
31
13
0
15 Oct 2024
FabGPT: An Efficient Large Multimodal Model for Complex Wafer Defect Knowledge Queries
Yuqi Jiang
Xudong Lu
Qian Jin
Qi Sun
Hanming Wu
Cheng Zhuo
36
5
0
15 Jul 2024
Learning Motion Blur Robust Vision Transformers with Dynamic Early Exit for Real-Time UAV Tracking
You Wu
Xucheng Wang
Dan Zeng
Hengzhou Ye
Xiaolan Xie
Qijun Zhao
Shuiwang Li
37
3
0
07 Jul 2024
Improving robustness to corruptions with multiplicative weight perturbations
Trung Trinh
Markus Heinonen
Luigi Acerbi
Samuel Kaski
44
0
0
24 Jun 2024
Predicting Probabilities of Error to Combine Quantization and Early Exiting: QuEE
Florence Regol
Joud Chataoui
Bertrand Charpentier
Mark J. Coates
Pablo Piantanida
Stephan Günnemann
45
0
0
20 Jun 2024
Adaptively Bypassing Vision Transformer Blocks for Efficient Visual Tracking
Xiangyang Yang
Dan Zeng
Xucheng Wang
You Wu
Hengzhou Ye
Qijun Zhao
Shuiwang Li
59
3
0
12 Jun 2024
A DeNoising FPN With Transformer R-CNN for Tiny Object Detection
Hou-I Liu
Yu-Wen Tseng
Kai-Cheng Chang
Pin-Jyun Wang
Hong-Han Shuai
Wen-Huang Cheng
ViT
ObjD
42
24
0
09 Jun 2024
Enhancing Efficiency in Vision Transformer Networks: Design Techniques and Insights
Moein Heidari
Reza Azad
Sina Ghorbani Kolahi
René Arimond
Leon Niggemeier
...
Afshin Bozorgpour
Ehsan Khodapanah Aghdam
A. Kazerouni
I. Hacihaliloglu
Dorit Merhof
51
7
0
28 Mar 2024
Boosting Transferability in Vision-Language Attacks via Diversification along the Intersection Region of Adversarial Trajectory
Sensen Gao
Xiaojun Jia
Xuhong Ren
Ivor Tsang
Qing-Wu Guo
AAML
38
14
0
19 Mar 2024
HIRI-ViT: Scaling Vision Transformer with High Resolution Inputs
Ting Yao
Yehao Li
Yingwei Pan
Tao Mei
ViT
31
15
0
18 Mar 2024
Segmentation Guided Sparse Transformer for Under-Display Camera Image Restoration
Jingyun Xue
Tao Wang
Jun Wang
Kaihao Zhang
ViT
51
2
0
09 Mar 2024
LUM-ViT: Learnable Under-sampling Mask Vision Transformer for Bandwidth Limited Optical Signal Acquisition
Lingfeng Liu
Dong Ni
Hangjie Yuan
ViT
35
0
0
03 Mar 2024
Orchid: Flexible and Data-Dependent Convolution for Sequence Modeling
Mahdi Karami
Ali Ghodsi
VLM
48
6
0
28 Feb 2024
FViT: A Focal Vision Transformer with Gabor Filter
Yulong Shi
Mingwei Sun
Yongshuai Wang
Rui Wang
57
4
0
17 Feb 2024
Learning Low-Rank Feature for Thorax Disease Classification
Rajeev Goel
Utkarsh Nath
Yancheng Wang
Alvin C. Silva
Teresa Wu
Yingzhen Yang
22
0
0
14 Feb 2024
DeSparsify: Adversarial Attack Against Token Sparsification Mechanisms in Vision Transformers
Oryan Yehezkel
Alon Zolfi
Amit Baras
Yuval Elovici
A. Shabtai
AAML
32
0
0
04 Feb 2024
Parameter-Efficient Fine-Tuning for Pre-Trained Vision Models: A Survey
Yi Xin
Jianjiang Yang
Haodi Zhou
Junlong Du
Yue Fan
Qing Li
Yuntao Du
VLM
75
75
0
03 Feb 2024
CascadedGaze: Efficiency in Global Context Extraction for Image Restoration
Amirhosein Ghasemabadi
Muhammad Kamran Janjua
Mohammad Salameh
Chunhua Zhou
Fengyu Sun
Di Niu
35
11
0
26 Jan 2024
Setting the Record Straight on Transformer Oversmoothing
G. Dovonon
M. Bronstein
Matt J. Kusner
28
5
0
09 Jan 2024
360 Layout Estimation via Orthogonal Planes Disentanglement and Multi-view Geometric Consistency Perception
Zhijie Shen
Chunyu Lin
Junsong Zhang
Lang Nie
K. Liao
Yao Zhao
28
5
0
26 Dec 2023
A Survey on Open-Set Image Recognition
Jiaying Sun
Qiulei Dong
BDL
ObjD
32
3
0
25 Dec 2023
Video Recognition in Portrait Mode
Mingfei Han
Linjie Yang
Xiaojie Jin
Jiashi Feng
Xiaojun Chang
Heng Wang
30
3
0
21 Dec 2023
Cached Transformers: Improving Transformers with Differentiable Memory Cache
Zhaoyang Zhang
Wenqi Shao
Yixiao Ge
Xiaogang Wang
Liang Feng
Ping Luo
16
2
0
20 Dec 2023
Graph Convolutions Enrich the Self-Attention in Transformers!
Jeongwhan Choi
Hyowon Wi
Jayoung Kim
Yehjin Shin
Kookjin Lee
Nathaniel Trask
Noseong Park
32
4
0
07 Dec 2023
SCHEME: Scalable Channel Mixer for Vision Transformers
Deepak Sridhar
Yunsheng Li
Nuno Vasconcelos
47
0
0
01 Dec 2023
QuadraNet: Improving High-Order Neural Interaction Efficiency with Hardware-Aware Quadratic Neural Networks
Chenhui Xu
Fuxun Yu
Zirui Xu
Chenchen Liu
Jinjun Xiong
Xiang Chen
33
4
0
29 Nov 2023
Improved TokenPose with Sparsity
Anning Li
ViT
34
0
0
16 Nov 2023
Rotation Invariant Transformer for Recognizing Object in UAVs
Shuo Chen
Mang Ye
Bo Du
ViT
32
18
0
05 Nov 2023
Improving Robustness for Vision Transformer with a Simple Dynamic Scanning Augmentation
Shashank Kotyan
Danilo Vasconcellos Vargas
ViT
27
2
0
01 Nov 2023
Minimalist and High-Performance Semantic Segmentation with Plain Vision Transformers
Yuanduo Hong
Jue Wang
Weichao Sun
Huihui Pan
VLM
ViT
37
7
0
19 Oct 2023
EViT: An Eagle Vision Transformer with Bi-Fovea Self-Attention
Yulong Shi
Mingwei Sun
Yongshuai Wang
Hui Sun
Zengqiang Chen
34
4
0
10 Oct 2023
Low-Resolution Self-Attention for Semantic Segmentation
Yu-Huan Wu
Shi-Chen Zhang
Yun-Hai Liu
Le Zhang
Xin Zhan
Daquan Zhou
Jiashi Feng
Ming-Ming Cheng
Liangli Zhen
ViT
45
3
0
08 Oct 2023