Make A Long Image Short: Adaptive Token Length for Vision Transformers
Yuqin Zhu, Yichen Zhu (5 July 2023) [ViT]
Papers citing "Make A Long Image Short: Adaptive Token Length for Vision Transformers" (19 of 19 papers shown)
1. When Less is Enough: Adaptive Token Reduction for Efficient Image Representation
   Eduard Allakhverdov, Elizaveta Goncharova, Andrey Kuznetsov (20 Mar 2025)

2. Tackling the Abstraction and Reasoning Corpus with Vision Transformers: the Importance of 2D Representation, Positions, and Objects
   Wenhao Li, Yudong Xu, Scott Sanner, Elias Boutros Khalil (08 Oct 2024) [ViT]

3. Agglomerative Token Clustering
   Joakim Bruslund Haurum, Sergio Escalera, Graham W. Taylor, T. Moeslund (18 Sep 2024)

4. When Training-Free NAS Meets Vision Transformer: A Neural Tangent Kernel Perspective
   Qiqi Zhou, Yichen Zhu (15 Mar 2024) [ViT]

5. Visual Robotic Manipulation with Depth-Aware Pretraining
   Wanying Wang, Jinming Li, Yichen Zhu, Zhiyuan Xu, Zhengping Che, Yaxin Peng, Chaomin Shen, Dong Liu, Feifei Feng, Jian Tang (17 Jan 2024) [MDE]

6. Object-Centric Instruction Augmentation for Robotic Manipulation
   Junjie Wen, Yichen Zhu, Minjie Zhu, Jinming Li, Zhiyuan Xu, ..., Chaomin Shen, Yaxin Peng, Dong Liu, Feifei Feng, Jian Tang (05 Jan 2024) [LM&Ro]

7. Which Tokens to Use? Investigating Token Reduction in Vision Transformers
   Joakim Bruslund Haurum, Sergio Escalera, Graham W. Taylor, T. Moeslund (09 Aug 2023) [ViT]

8. MSViT: Dynamic Mixed-Scale Tokenization for Vision Transformers
   Jakob Drachmann Havtorn, Amelie Royer, Tijmen Blankevoort, B. Bejnordi (05 Jul 2023)

9. Towards Accurate Image Coding: Improved Autoregressive Image Generation with Dynamic Vector Quantization
   Mengqi Huang, Zhendong Mao, Zhuowei Chen, Yongdong Zhang (19 May 2023) [MQ]

10. Efficient Transformer-based 3D Object Detection with Dynamic Token Halting
    Mao Ye, Gregory P. Meyer, Yuning Chai, Qiang Liu (09 Mar 2023)

11. Super Vision Transformer
    Mingbao Lin, Mengzhao Chen, Yu-xin Zhang, Yunhang Shen, Rongrong Ji, Liujuan Cao (23 May 2022) [ViT]

12. UniLog: Deploy One Model and Specialize it for All Log Analysis Tasks
    Yichen Zhu, Weibin Meng, Ying Liu, Shenglin Zhang, Tao Han, Shimin Tao, Dan Pei (06 Dec 2021) [MoE]

13. Masked Autoencoders Are Scalable Vision Learners
    Kaiming He, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Dollár, Ross B. Girshick (11 Nov 2021) [ViT, TPM]

14. Token Pooling in Vision Transformers
    D. Marin, Jen-Hao Rick Chang, Anurag Ranjan, Anish K. Prabhu, Mohammad Rastegari, Oncel Tuzel (08 Oct 2021) [ViT]

15. Intriguing Properties of Vision Transformers
    Muzammal Naseer, Kanchana Ranasinghe, Salman Khan, Munawar Hayat, F. Khan, Ming-Hsuan Yang (21 May 2021) [ViT]

16. Visformer: The Vision-friendly Transformer
    Zhengsu Chen, Lingxi Xie, Jianwei Niu, Xuefeng Liu, Longhui Wei, Qi Tian (26 Apr 2021) [ViT]

17. Transformer in Transformer
    Kai Han, An Xiao, Enhua Wu, Jianyuan Guo, Chunjing Xu, Yunhe Wang (27 Feb 2021) [ViT]

18. Is Space-Time Attention All You Need for Video Understanding?
    Gedas Bertasius, Heng Wang, Lorenzo Torresani (09 Feb 2021) [ViT]

19. Neural Architecture Search with Reinforcement Learning
    Barret Zoph, Quoc V. Le (05 Nov 2016)