ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2106.02034
  4. Cited By
DynamicViT: Efficient Vision Transformers with Dynamic Token
  Sparsification
v1v2 (latest)

DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification

3 June 2021
Yongming Rao
Wenliang Zhao
Benlin Liu
Jiwen Lu
Jie Zhou
Cho-Jui Hsieh
    ViT
ArXiv (abs)PDFHTMLGithub (608★)

Papers citing "DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification"

50 / 444 papers shown
Title
Rethinking Local Perception in Lightweight Vision Transformer
Rethinking Local Perception in Lightweight Vision Transformer
Qi Fan
Huaibo Huang
Jiyang Guan
Ran He
ViT
51
31
0
31 Mar 2023
SparseViT: Revisiting Activation Sparsity for Efficient High-Resolution
  Vision Transformer
SparseViT: Revisiting Activation Sparsity for Efficient High-Resolution Vision Transformer
Xuanyao Chen
Zhijian Liu
Haotian Tang
Li Yi
Hang Zhao
Song Han
ViT
205
48
0
30 Mar 2023
Token Merging for Fast Stable Diffusion
Token Merging for Fast Stable Diffusion
Daniel Bolya
Judy Hoffman
87
112
0
30 Mar 2023
Masked and Adaptive Transformer for Exemplar Based Image Translation
Masked and Adaptive Transformer for Exemplar Based Image Translation
Changlong Jiang
Fei Gao
Biao Ma
Yuhao Lin
N. Wang
Gang Xu
79
18
0
30 Mar 2023
Generalized Relation Modeling for Transformer Tracking
Generalized Relation Modeling for Transformer Tracking
Shenyuan Gao
Chunluan Zhou
Jun Zhang
ViT
64
113
0
29 Mar 2023
SwiftFormer: Efficient Additive Attention for Transformer-based
  Real-time Mobile Vision Applications
SwiftFormer: Efficient Additive Attention for Transformer-based Real-time Mobile Vision Applications
Abdelrahman M. Shaker
Muhammad Maaz
H. Rasheed
Salman Khan
Ming-Hsuan Yang
Fahad Shahbaz Khan
ViT
152
96
0
27 Mar 2023
Selective Structured State-Spaces for Long-Form Video Understanding
Selective Structured State-Spaces for Long-Form Video Understanding
Jue Wang
Wenjie Zhu
Pichao Wang
Xiang Yu
Linda Liu
Mohamed Omar
Raffay Hamid
89
100
0
25 Mar 2023
Video-Text as Game Players: Hierarchical Banzhaf Interaction for
  Cross-Modal Representation Learning
Video-Text as Game Players: Hierarchical Banzhaf Interaction for Cross-Modal Representation Learning
Peng Jin
Jinfa Huang
Pengfei Xiong
Shangxuan Tian
Chang-rui Liu
Xiang Ji
Li-ming Yuan
Jie Chen
101
59
0
25 Mar 2023
Sparsifiner: Learning Sparse Instance-Dependent Attention for Efficient
  Vision Transformers
Sparsifiner: Learning Sparse Instance-Dependent Attention for Efficient Vision Transformers
Cong Wei
Brendan Duke
R. Jiang
P. Aarabi
Graham W. Taylor
Florian Shkurti
ViT
99
17
0
24 Mar 2023
MonoATT: Online Monocular 3D Object Detection with Adaptive Token
  Transformer
MonoATT: Online Monocular 3D Object Detection with Adaptive Token Transformer
Yunsong Zhou
Hongzi Zhu
Quan Liu
Shan Chang
Minyi Guo
ViT
118
25
0
23 Mar 2023
Making Vision Transformers Efficient from A Token Sparsification View
Making Vision Transformers Efficient from A Token Sparsification View
Shuning Chang
Pichao Wang
Ming Lin
Fan Wang
David Junhao Zhang
Rong Jin
Mike Zheng Shou
ViT
98
26
0
15 Mar 2023
Towards High-Quality and Efficient Video Super-Resolution via
  Spatial-Temporal Data Overfitting
Towards High-Quality and Efficient Video Super-Resolution via Spatial-Temporal Data Overfitting
Gen Li
Jie Ji
Minghai Qin
Wei Niu
Bin Ren
Fatemeh Afghah
Lin Guo
Xiaolong Ma
SupR
121
12
0
15 Mar 2023
Window-Based Early-Exit Cascades for Uncertainty Estimation: When Deep
  Ensembles are More Efficient than Single Models
Window-Based Early-Exit Cascades for Uncertainty Estimation: When Deep Ensembles are More Efficient than Single Models
Guoxuan Xia
C. Bouganis
UQCV
113
12
0
14 Mar 2023
Revisit Parameter-Efficient Transfer Learning: A Two-Stage Paradigm
Revisit Parameter-Efficient Transfer Learning: A Two-Stage Paradigm
Hengyuan Zhao
Hao Luo
Yuyang Zhao
Pichao Wang
F. Wang
Mike Zheng Shou
68
5
0
14 Mar 2023
Token Sparsification for Faster Medical Image Segmentation
Token Sparsification for Faster Medical Image Segmentation
Lei Zhou
Huidong Liu
Joseph Bae
Junjun He
Dimitris Samaras
Prateek Prasanna
MedIm
57
3
0
11 Mar 2023
Efficient Transformer-based 3D Object Detection with Dynamic Token
  Halting
Efficient Transformer-based 3D Object Detection with Dynamic Token Halting
Mao Ye
Gregory P. Meyer
Yuning Chai
Qiang Liu
71
9
0
09 Mar 2023
SGDViT: Saliency-Guided Dynamic Vision Transformer for UAV Tracking
SGDViT: Saliency-Guided Dynamic Vision Transformer for UAV Tracking
L. Yao
Changhong Fu
Sihang Li
Guang-Zheng Zheng
Junjie Ye
58
31
0
08 Mar 2023
Filter Pruning based on Information Capacity and Independence
Filter Pruning based on Information Capacity and Independence
Xiaolong Tang
Shuo Ye
Yufeng Shi
Tianheng Hu
Qinmu Peng
Xinge You
VLM
66
1
0
07 Mar 2023
Training-Free Acceleration of ViTs with Delayed Spatial Merging
Training-Free Acceleration of ViTs with Delayed Spatial Merging
J. Heo
Seyedarmin Azizi
A. Fayyazi
Massoud Pedram
123
3
0
04 Mar 2023
OmniForce: On Human-Centered, Large Model Empowered and Cloud-Edge
  Collaborative AutoML System
OmniForce: On Human-Centered, Large Model Empowered and Cloud-Edge Collaborative AutoML System
Chao Xue
Wen Liu
Shunxing Xie
Zhenfang Wang
Jiaxing Li
...
Shi-Yong Chen
Yibing Zhan
Jing Zhang
Chaoyue Wang
Dacheng Tao
94
2
0
01 Mar 2023
SpeechFormer++: A Hierarchical Efficient Framework for Paralinguistic
  Speech Processing
SpeechFormer++: A Hierarchical Efficient Framework for Paralinguistic Speech Processing
Weidong Chen
Xiaofen Xing
Xiangmin Xu
Jianxin Pang
Lan Du
76
40
0
27 Feb 2023
Map-and-Conquer: Energy-Efficient Mapping of Dynamic Neural Nets onto
  Heterogeneous MPSoCs
Map-and-Conquer: Energy-Efficient Mapping of Dynamic Neural Nets onto Heterogeneous MPSoCs
Halima Bouzidi
Mohanad Odema
Hamza Ouarnoughi
Smail Niar
Mohammad Abdullah Al Faruque
64
10
0
24 Feb 2023
Deep Learning for Video-Text Retrieval: a Review
Deep Learning for Video-Text Retrieval: a Review
Cunjuan Zhu
Qi Jia
Wei Chen
Yanming Guo
Yu Liu
75
18
0
24 Feb 2023
A residual dense vision transformer for medical image super-resolution
  with segmentation-based perceptual loss fine-tuning
A residual dense vision transformer for medical image super-resolution with segmentation-based perceptual loss fine-tuning
Jin Zhu
Guang Yang
Pietro Lio
ViTMedIm
80
5
0
22 Feb 2023
Stitchable Neural Networks
Stitchable Neural Networks
Zizheng Pan
Jianfei Cai
Bohan Zhuang
91
25
0
13 Feb 2023
A Theoretical Understanding of Shallow Vision Transformers: Learning,
  Generalization, and Sample Complexity
A Theoretical Understanding of Shallow Vision Transformers: Learning, Generalization, and Sample Complexity
Hongkang Li
Ming Wang
Sijia Liu
Pin-Yu Chen
ViTMLT
138
64
0
12 Feb 2023
UPop: Unified and Progressive Pruning for Compressing Vision-Language
  Transformers
UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers
Dachuan Shi
Chaofan Tao
Ying Jin
Zhendong Yang
Chun Yuan
Jiaqi Wang
VLMViT
106
39
0
31 Jan 2023
PIT: Optimization of Dynamic Sparse Deep Learning Models via Permutation
  Invariant Transformation
PIT: Optimization of Dynamic Sparse Deep Learning Models via Permutation Invariant Transformation
Ningxin Zheng
Huiqiang Jiang
Quan Zhang
Zhenhua Han
Yuqing Yang
...
Fan Yang
Chengruidong Zhang
Lili Qiu
Mao Yang
Lidong Zhou
102
29
0
26 Jan 2023
GOHSP: A Unified Framework of Graph and Optimization-based Heterogeneous
  Structured Pruning for Vision Transformer
GOHSP: A Unified Framework of Graph and Optimization-based Heterogeneous Structured Pruning for Vision Transformer
Miao Yin
Burak Uzkent
Yilin Shen
Hongxia Jin
Bo Yuan
ViT
69
16
0
13 Jan 2023
STPrivacy: Spatio-Temporal Privacy-Preserving Action Recognition
STPrivacy: Spatio-Temporal Privacy-Preserving Action Recognition
Ming Li
Xiangyu Xu
Hehe Fan
Pan Zhou
Jun Liu
Jia-Wei Liu
Jiahe Li
Jussi Keppo
Mike Zheng Shou
Shuicheng Yan
ViTPICV
104
13
0
08 Jan 2023
Skip-Attention: Improving Vision Transformers by Paying Less Attention
Skip-Attention: Improving Vision Transformers by Paying Less Attention
Shashanka Venkataramanan
Amir Ghodrati
Yuki M. Asano
Fatih Porikli
A. Habibian
ViT
96
30
0
05 Jan 2023
What Makes for Good Tokenizers in Vision Transformer?
What Makes for Good Tokenizers in Vision Transformer?
Shengju Qian
Yi Zhu
Wenbo Li
Mu Li
Jiaya Jia
ViT
91
14
0
21 Dec 2022
Attentive Mask CLIP
Attentive Mask CLIP
Yifan Yang
Weiquan Huang
Yixuan Wei
Houwen Peng
Xinyang Jiang
...
Fangyun Wei
Yin Wang
Han Hu
Lili Qiu
Yuqing Yang
CLIPVLM
83
27
0
16 Dec 2022
Rethinking Vision Transformers for MobileNet Size and Speed
Rethinking Vision Transformers for MobileNet Size and Speed
Yanyu Li
Ju Hu
Yang Wen
Georgios Evangelidis
Kamyar Salahi
Yanzhi Wang
Sergey Tulyakov
Jian Ren
ViT
112
169
0
15 Dec 2022
FlexiViT: One Model for All Patch Sizes
FlexiViT: One Model for All Patch Sizes
Lucas Beyer
Pavel Izmailov
Alexander Kolesnikov
Mathilde Caron
Simon Kornblith
Xiaohua Zhai
Matthias Minderer
Michael Tschannen
Ibrahim Alabdulmohsin
Filip Pavetić
VLM
148
94
0
15 Dec 2022
Most Important Person-guided Dual-branch Cross-Patch Attention for Group
  Affect Recognition
Most Important Person-guided Dual-branch Cross-Patch Attention for Group Affect Recognition
Hongxia Xie
Ming-Xian Lee
Tzu-Jui Chen
Hung-Jen Chen
Hou-I Liu
Hong-Han Shuai
Wen-Huang Cheng
CVBM
69
8
0
14 Dec 2022
Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video
  Learning
Rethinking Video ViTs: Sparse Video Tubes for Joint Image and Video Learning
A. Piergiovanni
Weicheng Kuo
A. Angelova
ViT
84
58
0
06 Dec 2022
ResFormer: Scaling ViTs with Multi-Resolution Training
ResFormer: Scaling ViTs with Multi-Resolution Training
Rui Tian
Zuxuan Wu
Qiuju Dai
Hang-Rui Hu
Yu Qiao
Yu-Gang Jiang
ViT
86
35
0
01 Dec 2022
Dynamic Feature Pruning and Consolidation for Occluded Person
  Re-Identification
Dynamic Feature Pruning and Consolidation for Occluded Person Re-Identification
Yuteng Ye
Hang Zhou
Jiale Cai
Chenxing Gao
Youjia Zhang
Junle Wang
Qiang Hu
Junqing Yu
Wei Yang
62
6
0
27 Nov 2022
SMAUG: Sparse Masked Autoencoder for Efficient Video-Language
  Pre-training
SMAUG: Sparse Masked Autoencoder for Efficient Video-Language Pre-training
Yuanze Lin
Chen Wei
Huiyu Wang
Alan Yuille
Cihang Xie
3DGS
109
15
0
21 Nov 2022
Beyond Attentive Tokens: Incorporating Token Importance and Diversity
  for Efficient Vision Transformers
Beyond Attentive Tokens: Incorporating Token Importance and Diversity for Efficient Vision Transformers
Sifan Long
Z. Zhao
Jimin Pi
Sheng-sheng Wang
Jingdong Wang
85
39
0
21 Nov 2022
You Need Multiple Exiting: Dynamic Early Exiting for Accelerating
  Unified Vision Language Model
You Need Multiple Exiting: Dynamic Early Exiting for Accelerating Unified Vision Language Model
Sheng Tang
Yaqing Wang
Zhenglun Kong
Tianchi Zhang
Yao Li
Caiwen Ding
Yanzhi Wang
Yi Liang
Dongkuan Xu
84
34
0
21 Nov 2022
Peeling the Onion: Hierarchical Reduction of Data Redundancy for
  Efficient Vision Transformer Training
Peeling the Onion: Hierarchical Reduction of Data Redundancy for Efficient Vision Transformer Training
Zhenglun Kong
Haoyu Ma
Geng Yuan
Mengshu Sun
Yanyue Xie
...
Tianlong Chen
Xiaolong Ma
Xiaohui Xie
Zhangyang Wang
Yanzhi Wang
ViT
112
24
0
19 Nov 2022
TORE: Token Reduction for Efficient Human Mesh Recovery with Transformer
TORE: Token Reduction for Efficient Human Mesh Recovery with Transformer
Zhiyang Dou
Qingxuan Wu
Chu-Hsing Lin
Zeyu Cao
Qiangqiang Wu
Weilin Wan
Taku Komura
Wenping Wang
87
40
0
19 Nov 2022
Rethinking Batch Sample Relationships for Data Representation: A
  Batch-Graph Transformer based Approach
Rethinking Batch Sample Relationships for Data Representation: A Batch-Graph Transformer based Approach
Xixi Wang
Bowei Jiang
Tianlin Li
Bin Luo
ViT
103
5
0
19 Nov 2022
Castling-ViT: Compressing Self-Attention via Switching Towards
  Linear-Angular Attention at Vision Transformer Inference
Castling-ViT: Compressing Self-Attention via Switching Towards Linear-Angular Attention at Vision Transformer Inference
Haoran You
Yunyang Xiong
Xiaoliang Dai
Bichen Wu
Peizhao Zhang
Haoqi Fan
Peter Vajda
Yingyan Lin
131
34
0
18 Nov 2022
EfficientTrain: Exploring Generalized Curriculum Learning for Training
  Visual Backbones
EfficientTrain: Exploring Generalized Curriculum Learning for Training Visual Backbones
Yulin Wang
Yang Yue
Rui Lu
Tian-De Liu
Zhaobai Zhong
S. Song
Gao Huang
90
29
0
17 Nov 2022
Token Turing Machines
Token Turing Machines
Michael S. Ryoo
K. Gopalakrishnan
Kumara Kahatapitiya
Ted Xiao
Kanishka Rao
Austin Stone
Yao Lu
Julian Ibarz
Anurag Arnab
61
21
0
16 Nov 2022
HeatViT: Hardware-Efficient Adaptive Token Pruning for Vision
  Transformers
HeatViT: Hardware-Efficient Adaptive Token Pruning for Vision Transformers
Peiyan Dong
Mengshu Sun
Alec Lu
Yanyue Xie
Li-Yu Daisy Liu
...
Xin Meng
Zechao Li
Xue Lin
Zhenman Fang
Yanzhi Wang
ViT
95
71
0
15 Nov 2022
Fcaformer: Forward Cross Attention in Hybrid Vision Transformer
Fcaformer: Forward Cross Attention in Hybrid Vision Transformer
Haokui Zhang
Wenze Hu
Xiaoyu Wang
ViT
63
8
0
14 Nov 2022
Previous
123456789
Next