2106.02034
DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification
3 June 2021
Yongming Rao
Wenliang Zhao
Benlin Liu
Jiwen Lu
Jie Zhou
Cho-Jui Hsieh
ViT
ArXiv (abs)
PDF
HTML
Github (608★)
Papers citing
"DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification"
50 / 444 papers shown
Title
Advancing Vision Transformers with Group-Mix Attention
Chongjian Ge
Xiaohan Ding
Zhan Tong
Li Yuan
Jiangliu Wang
Yibing Song
Ping Luo
166
18
0
26 Nov 2023
Hourglass Tokenizer for Efficient Transformer-Based 3D Human Pose Estimation
Wenhao Li
Mengyuan Liu
Hong Liu
Pichao Wang
Jia Cai
N. Sebe
ViT
3DH
71
12
0
20 Nov 2023
Improved TokenPose with Sparsity
Anning Li
ViT
68
0
0
16 Nov 2023
Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding
Peng Jin
Ryuichi Takanobu
Caiwan Zhang
Xiaochun Cao
Li-ming Yuan
MLLM
138
249
0
14 Nov 2023
Explainability of Vision Transformers: A Comprehensive Review and New Perspectives
Rojina Kashefi
Leili Barekatain
Mohammad Sabokrou
Fatemeh Aghaeipoor
ViT
105
10
0
12 Nov 2023
A Hierarchical Spatial Transformer for Massive Point Samples in Continuous Space
Wenchong He
Zhe Jiang
Tingsong Xiao
Zelin Xu
Shigang Chen
Ronald Fick
Miles Medina
Christine Angelini
87
17
0
08 Nov 2023
FLORA: Fine-grained Low-Rank Architecture Search for Vision Transformer
Chi-Chih Chang
Yuan-Yao Sung
Shixing Yu
N. Huang
Diana Marculescu
Kai-Chiang Wu
ViT
54
1
0
07 Nov 2023
GTP-ViT: Efficient Vision Transformers via Graph-based Token Propagation
Xuwei Xu
Sen Wang
Yudong Chen
Yanping Zheng
Zhewei Wei
Jiajun Liu
ViT
98
12
0
06 Nov 2023
TokenMotion: Motion-Guided Vision Transformer for Video Camouflaged Object Detection Via Learnable Token Selection
Zifan Yu
E. Tavakoli
Meida Chen
Suya You
Raghuveer Rao
Sanjeev Agarwal
Fengbo Ren
73
2
0
05 Nov 2023
AiluRus: A Scalable ViT Framework for Dense Prediction
Jin Li
Yaoming Wang
Xiaopeng Zhang
Bowen Shi
Dongsheng Jiang
Chenglin Li
Wenrui Dai
Hongkai Xiong
Qi Tian
119
5
0
02 Nov 2023
PAUMER: Patch Pausing Transformer for Semantic Segmentation
Evann Courdier
Prabhu Teja Sivaprasad
François Fleuret
87
2
0
01 Nov 2023
TESTA: Temporal-Spatial Token Aggregation for Long-form Video-Language Understanding
Shuhuai Ren
Sishuo Chen
Shicheng Li
Xu Sun
Lu Hou
ViT
92
34
0
29 Oct 2023
Bridging The Gaps Between Token Pruning and Full Pre-training via Masked Fine-tuning
Fengyuan Shi
Limin Wang
ViT
52
0
0
26 Oct 2023
MCUFormer: Deploying Vision Transformers on Microcontrollers with Limited Memory
Yinan Liang
Ziwei Wang
Xiuwei Xu
Yansong Tang
Jie Zhou
Jiwen Lu
92
10
0
25 Oct 2023
USDC: Unified Static and Dynamic Compression for Visual Transformer
Huan Yuan
Chao Liao
Jianchao Tan
Peng Yao
Jiyuan Jia
Bin Chen
Chengru Song
Di Zhang
ViT
29
0
0
17 Oct 2023
Accelerating Vision Transformers Based on Heterogeneous Attention Patterns
Deli Yu
Teng Xi
Jianwei Li
Baopu Li
Gang Zhang
Haocheng Feng
Junyu Han
Jingtuo Liu
Errui Ding
Jingdong Wang
ViT
69
1
0
11 Oct 2023
LLMLingua: Compressing Prompts for Accelerated Inference of Large Language Models
Huiqiang Jiang
Qianhui Wu
Chin-Yew Lin
Yuqing Yang
Lili Qiu
109
118
0
09 Oct 2023
No Token Left Behind: Efficient Vision Transformer via Dynamic Token Idling
Xuwei Xu
Changlin Li
Yudong Chen
Xiaojun Chang
Jiajun Liu
Sen Wang
ViT
68
6
0
09 Oct 2023
Plug n' Play: Channel Shuffle Module for Enhancing Tiny Vision Transformers
Xuwei Xu
Sen Wang
Yudong Chen
Jiajun Liu
ViT
59
1
0
09 Oct 2023
ObjFormer: Learning Land-Cover Changes From Paired OSM Data and Optical High-Resolution Imagery via Object-Guided Transformer
Hongruixuan Chen
Cuiling Lan
Jian Song
Clifford Broni-Bediako
Junshi Xia
Naoto Yokoya
121
18
0
04 Oct 2023
SlowFormer: Universal Adversarial Patch for Attack on Compute and Energy Efficiency of Inference Efficient Vision Transformers
K. Navaneet
Soroush Abbasi Koohpayegani
Essam Sleiman
Hamed Pirsiavash
AAML
ViT
55
3
0
04 Oct 2023
PPT: Token Pruning and Pooling for Efficient Vision Transformers
Xinjian Wu
Fanhu Zeng
Xiudong Wang
Xinghao Chen
ViT
91
27
0
03 Oct 2023
Win-Win: Training High-Resolution Vision Transformers from Two Windows
Vincent Leroy
Jérôme Revaud
Thomas Lucas
Philippe Weinzaepfel
ViT
107
2
0
01 Oct 2023
CAIT: Triple-Win Compression towards High Accuracy, Fast Inference, and Favorable Transferability For ViTs
Ao Wang
Hui Chen
Zijia Lin
Sicheng Zhao
Jiawei Han
Guiguang Ding
ViT
56
6
0
27 Sep 2023
Beyond Grids: Exploring Elastic Input Sampling for Vision Transformers
Adam Pardyl
Grzegorz Kurzejamski
Jan Olszewski
Tomasz Trzciński
Bartosz Zieliński
54
1
0
23 Sep 2023
RTrack: Accelerating Convergence for Visual Object Tracking via Pseudo-Boxes Exploration
Guotian Zeng
Bi Zeng
Hong Zhang
Jianqi Liu
Qingmao Wei
56
1
0
23 Sep 2023
SCT: A Simple Baseline for Parameter-Efficient Fine-Tuning via Salient Channels
Henry Hengyuan Zhao
Pichao Wang
Yuyang Zhao
Hao Luo
F. Wang
Mike Zheng Shou
ViT
100
14
0
15 Sep 2023
Dynamic Spectrum Mixer for Visual Recognition
Zhiqiang Hu
Tao Yu
56
3
0
13 Sep 2023
Unified Language-Vision Pretraining in LLM with Dynamic Discrete Visual Tokenization
Yang Jin
Kun Xu
Kun Xu
Liwei Chen
Chao Liao
...
Xiaoqiang Lei
Di Zhang
Wenwu Ou
Kun Gai
Yadong Mu
MLLM
VLM
79
50
0
09 Sep 2023
LLMCad: Fast and Scalable On-device Large Language Model Inference
Daliang Xu
Wangsong Yin
Xin Jin
Yanzhe Zhang
Shiyun Wei
Mengwei Xu
Xuanzhe Liu
69
50
0
08 Sep 2023
ProPainter: Improving Propagation and Transformer for Video Inpainting
Shangchen Zhou
Chongyi Li
Kelvin C. K. Chan
Chen Change Loy
ViT
110
105
0
07 Sep 2023
A survey on efficient vision transformers: algorithms, techniques, and performance benchmarking
Lorenzo Papa
Paolo Russo
Irene Amerini
Luping Zhou
98
45
0
05 Sep 2023
Eventful Transformers: Leveraging Temporal Redundancy in Vision Transformers
Matthew Dutson
Yin Li
M. Gupta
ViT
85
10
0
25 Aug 2023
Which Transformer to Favor: A Comparative Analysis of Efficiency in Vision Transformers
Tobias Christian Nauen
Sebastián M. Palacio
Federico Raue
Andreas Dengel
143
4
0
18 Aug 2023
Which Tokens to Use? Investigating Token Reduction in Vision Transformers
Joakim Bruslund Haurum
Sergio Escalera
Graham W. Taylor
T. Moeslund
ViT
102
38
0
09 Aug 2023
Prune Spatio-temporal Tokens by Semantic-aware Temporal Accumulation
Shuangrui Ding
Peisen Zhao
Xiaopeng Zhang
Rui Qian
H. Xiong
Qi Tian
ViT
78
18
0
08 Aug 2023
DiT: Efficient Vision Transformers with Dynamic Token Routing
Yuchen Ma
Zhengcong Fei
Junshi Huang
ViT
48
2
0
07 Aug 2023
Redundancy-aware Transformer for Video Question Answering
Yicong Li
Xun Yang
An Zhang
Chun Feng
Xiang Wang
Tat-Seng Chua
72
16
0
07 Aug 2023
Learning Implicit Entity-object Relations by Bidirectional Generative Alignment for Multimodal NER
Feng Chen
Jiajia Liu
Kaixiang Ji
Wang Ren
Jian Wang
Jingdong Wang
46
10
0
03 Aug 2023
Dynamic Token-Pass Transformers for Semantic Segmentation
Yuang Liu
Qiang Zhou
Jing Wang
Fan Wang
Jun Wang
Wei Zhang
ViT
35
5
0
03 Aug 2023
Dynamic Token Pruning in Plain Vision Transformers for Semantic Segmentation
Quan Tang
Bowen Zhang
Jiajun Liu
Fagui Liu
Yifan Liu
ViT
113
30
0
02 Aug 2023
HandMIM: Pose-Aware Self-Supervised Learning for 3D Hand Mesh Estimation
Zuyan Liu
Gaojie Lin
Congyi Wang
Min Zheng
Feida Zhu
3DH
68
0
0
29 Jul 2023
Less is More: Focus Attention for Efficient DETR
Dehua Zheng
Wenhui Dong
Hailin Hu
Xinghao Chen
Yunhe Wang
70
65
0
24 Jul 2023
Towards Video Anomaly Retrieval from Video Anomaly Detection: New Benchmarks and Model
Peng Wu
Jing Liu
Xiangteng He
Yuxin Peng
Peng Wang
Yanning Zhang
124
34
0
24 Jul 2023
Learned Thresholds Token Merging and Pruning for Vision Transformers
Maxim Bonnaerens
J. Dambre
98
23
0
20 Jul 2023
Scale-Aware Modulation Meet Transformer
Wei-Shiang Lin
Ziheng Wu
Jiayu Chen
Jun Huang
Lianwen Jin
MoE
ViT
110
77
0
17 Jul 2023
BUS: Efficient and Effective Vision-language Pre-training with Bottom-Up Patch Summarization
Chaoya Jiang
Haiyang Xu
Wei Ye
Qinghao Ye
Chenliang Li
Mingshi Yan
Bin Bi
Shikun Zhang
Fei Huang
Songfang Huang
VLM
50
9
0
17 Jul 2023
A Survey of Techniques for Optimizing Transformer Inference
Krishna Teja Chitty-Venkata
Sparsh Mittal
M. Emani
V. Vishwanath
Arun Somani
123
73
0
16 Jul 2023
Learning Sparse Neural Networks with Identity Layers
Mingjian Ni
Guangyao Chen
Xiawu Zheng
Peixi Peng
Liuliang Yuan
Yonghong Tian
59
0
0
14 Jul 2023
MSViT: Dynamic Mixed-Scale Tokenization for Vision Transformers
Jakob Drachmann Havtorn
Amelie Royer
Tijmen Blankevoort
B. Bejnordi
81
8
0
05 Jul 2023