2210.09461
Cited By
Token Merging: Your ViT But Faster
17 October 2022
Daniel Bolya
Cheng-Yang Fu
Xiaoliang Dai
Peizhao Zhang
Christoph Feichtenhofer
Judy Hoffman
MoMe
Papers citing
"Token Merging: Your ViT But Faster"
50 / 321 papers shown
Title
Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation
Wangbo Zhao
Jiasheng Tang
Yizeng Han
Yibing Song
Kai Wang
Gao Huang
F. Wang
Yang You
40
11
0
18 Mar 2024
Semantic Prompting with Image-Token for Continual Learning
Jisu Han
Jaemin Na
Wonjun Hwang
CLL
VLM
45
1
0
18 Mar 2024
Magic Tokens: Select Diverse Tokens for Multi-modal Object Re-Identification
Pingping Zhang
Yuhao Wang
Yang Liu
Zhengzheng Tu
Huchuan Lu
23
21
0
15 Mar 2024
Multi-criteria Token Fusion with One-step-ahead Attention for Efficient Vision Transformers
Sanghyeok Lee
Joonmyung Choi
Hyunwoo J. Kim
ViT
45
7
0
15 Mar 2024
On the Utility of 3D Hand Poses for Action Recognition
Md Salman Shamil
Dibyadip Chatterjee
Fadime Sener
Shugao Ma
Angela Yao
37
5
0
14 Mar 2024
Dynamic Memory Compression: Retrofitting LLMs for Accelerated Inference
Piotr Nawrot
Adrian Łańcucki
Marcin Chochowski
David Tarjan
E. Ponti
33
50
0
14 Mar 2024
PYRA: Parallel Yielding Re-Activation for Training-Inference Efficient Task Adaptation
Yizhe Xiong
Hui Chen
Tianxiang Hao
Zijia Lin
Jungong Han
Yuesong Zhang
Guoxin Wang
Yongjun Bao
Guiguang Ding
48
16
0
14 Mar 2024
Conditional computation in neural networks: principles and research trends
Simone Scardapane
Alessandro Baiocchi
Alessio Devoto
V. Marsocci
Pasquale Minervini
Jary Pomponi
34
1
0
12 Mar 2024
SPA: Towards A Computational Friendly Cloud-Base and On-Devices Collaboration Seq2seq Personalized Generation
Yanming Liu
Xinyue Peng
Jiannan Cao
Le Dai
Xingzu Liu
Mingbang Wang
Weihao Liu
SyDa
41
2
0
11 Mar 2024
TextMonkey: An OCR-Free Large Multimodal Model for Understanding Document
Yuliang Liu
Biao Yang
Qiang Liu
Zhang Li
Zhiyin Ma
Shuo Zhang
Xiang Bai
MLLM
VLM
49
88
0
07 Mar 2024
Online Adaptation of Language Models with a Memory of Amortized Contexts
Jihoon Tack
Jaehyung Kim
Eric Mitchell
Jinwoo Shin
Yee Whye Teh
Jonathan Richard Schwarz
KELM
47
18
0
07 Mar 2024
MADTP: Multimodal Alignment-Guided Dynamic Token Pruning for Accelerating Vision-Language Transformer
Jianjian Cao
Peng Ye
Shengze Li
Chong Yu
Yansong Tang
Jiwen Lu
Tao Chen
32
15
0
05 Mar 2024
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models
Yixin Liu
Kai Zhang
Yuan Li
Zhiling Yan
Chujie Gao
...
Yue Huang
Hanchi Sun
Jianfeng Gao
Lifang He
Lichao Sun
VLM
VGen
EGVM
75
259
0
27 Feb 2024
T-Stitch: Accelerating Sampling in Pre-Trained Diffusion Models with Trajectory Stitching
Zizheng Pan
Bohan Zhuang
De-An Huang
Weili Nie
Zhiding Yu
Chaowei Xiao
Jianfei Cai
A. Anandkumar
33
17
0
21 Feb 2024
ToDo: Token Downsampling for Efficient Generation of High-Resolution Images
Ethan Smith
Nayan Saxena
Aninda Saha
DiffM
30
5
0
21 Feb 2024
Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference
Harry Dong
Xinyu Yang
Zhenyu (Allen) Zhang
Zhangyang Wang
Yuejie Chi
Beidi Chen
27
49
0
14 Feb 2024
LHRS-Bot: Empowering Remote Sensing with VGI-Enhanced Large Multimodal Language Model
Dilxat Muhtar
Zhenshi Li
Feng-Xue Gu
Xue-liang Zhang
P. Xiao
78
49
0
04 Feb 2024
Convolution Meets LoRA: Parameter Efficient Finetuning for Segment Anything Model
Zihan Zhong
Zhiqiang Tang
Tong He
Haoyang Fang
Chun Yuan
46
41
0
31 Jan 2024
The What, Why, and How of Context Length Extension Techniques in Large Language Models -- A Detailed Survey
Saurav Pawar
S.M. Towhidul Islam Tonmoy
S. M. M. Zaman
Vinija Jain
Aman Chadha
Amitava Das
37
27
0
15 Jan 2024
Object-Centric Diffusion for Efficient Video Editing
Kumara Kahatapitiya
Adil Karjauv
Davide Abati
Fatih Porikli
Yuki M. Asano
A. Habibian
VGen
37
12
0
11 Jan 2024
OTAS: An Elastic Transformer Serving System via Token Adaptation
Jinyu Chen
Wenchao Xu
Zicong Hong
Song Guo
Yining Qi
Jie Zhang
Deze Zeng
25
4
0
10 Jan 2024
EmMixformer: Mix transformer for eye movement recognition
Huafeng Qin
Hongyu Zhu
Xin Jin
Qun Song
M. El-Yacoubi
Xinbo Gao
41
7
0
10 Jan 2024
Morphing Tokens Draw Strong Masked Image Models
Taekyung Kim
Byeongho Heo
Dongyoon Han
54
3
0
30 Dec 2023
Video Understanding with Large Language Models: A Survey
Yunlong Tang
Jing Bi
Siting Xu
Luchuan Song
Susan Liang
...
Feng Zheng
Jianguo Zhang
Ping Luo
Jiebo Luo
Chenliang Xu
VLM
54
83
0
29 Dec 2023
SHaRPose: Sparse High-Resolution Representation for Human Pose Estimation
Xiaoqi An
Lin Zhao
Chen Gong
Nannan Wang
Di Wang
Jian Yang
3DH
ViT
30
7
0
17 Dec 2023
VidToMe: Video Token Merging for Zero-Shot Video Editing
Xirui Li
Chao Ma
Xiaokang Yang
Ming-Hsuan Yang
DiffM
VGen
29
40
0
17 Dec 2023
Adaptive Computation Modules: Granular Conditional Computation For Efficient Inference
Bartosz Wójcik
Alessio Devoto
Karol Pustelnik
Pasquale Minervini
Simone Scardapane
20
5
0
15 Dec 2023
Faster Diffusion: Rethinking the Role of UNet Encoder in Diffusion Models
Senmao Li
Taihang Hu
Fahad Shahbaz Khan
Linxuan Li
Shiqi Yang
Yaxing Wang
Ming-Ming Cheng
Jian Yang
DiffM
34
1
0
15 Dec 2023
Agent Attention: On the Integration of Softmax and Linear Attention
Dongchen Han
Tianzhu Ye
Yizeng Han
Zhuofan Xia
Siyuan Pan
Pengfei Wan
Shiji Song
Gao Huang
32
74
0
14 Dec 2023
Turbo: Informativity-Driven Acceleration Plug-In for Vision-Language Models
Chen Ju
Haicheng Wang
Zeqian Li
Xu Chen
Zhonghua Zhai
Weilin Huang
Shuai Xiao
VLM
73
7
0
12 Dec 2023
F3-Pruning: A Training-Free and Generalized Pruning Strategy towards Faster and Finer Text-to-Video Synthesis
Sitong Su
Jianzhi Liu
Lianli Gao
Jingkuan Song
DiffM
VGen
19
4
0
06 Dec 2023
Bootstrapping SparseFormers from Vision Foundation Models
Ziteng Gao
Zhan Tong
K. Lin
Joya Chen
Mike Zheng Shou
35
0
0
04 Dec 2023
A Comprehensive Study of Vision Transformers in Image Classification Tasks
Mahmoud Khalil
Ahmad Khalil
A. Ngom
ViT
18
8
0
02 Dec 2023
Token Fusion: Bridging the Gap between Token Pruning and Token Merging
Minchul Kim
Shangqian Gao
Yen-Chang Hsu
Yilin Shen
Hongxia Jin
23
29
0
02 Dec 2023
Merlin: Empowering Multimodal LLMs with Foresight Minds
En Yu
Liang Zhao
Yana Wei
Jinrong Yang
Dongming Wu
...
Haoran Wei
Tiancai Wang
Zheng Ge
Xiangyu Zhang
Wenbing Tao
LRM
18
25
0
30 Nov 2023
Perceptual Group Tokenizer: Building Perception with Iterative Grouping
Zhiwei Deng
Ting Chen
Yang Li
ViT
VLM
23
2
0
30 Nov 2023
Align before Adapt: Leveraging Entity-to-Region Alignments for Generalizable Video Action Recognition
Yifei Chen
Dapeng Chen
Ruijin Liu
Sai Zhou
Wenyuan Xue
Wei Peng
25
6
0
27 Nov 2023
Hourglass Tokenizer for Efficient Transformer-Based 3D Human Pose Estimation
Wenhao Li
Mengyuan Liu
Hong Liu
Pichao Wang
Jia Cai
N. Sebe
ViT
3DH
25
10
0
20 Nov 2023
I&S-ViT: An Inclusive & Stable Method for Pushing the Limit of Post-Training ViTs Quantization
Yunshan Zhong
Jiawei Hu
Mingbao Lin
Mengzhao Chen
Rongrong Ji
MQ
30
10
0
16 Nov 2023
Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding
Peng Jin
Ryuichi Takanobu
Caiwan Zhang
Xiaochun Cao
Li-ming Yuan
MLLM
36
223
0
14 Nov 2023
To Transformers and Beyond: Large Language Models for the Genome
Micaela Elisa Consens
Cameron Dufault
Michael Wainberg
Duncan Forster
Mehran Karimzadeh
Hani Goodarzi
Fabian J. Theis
Alan Moses
Bo Wang
LM&MA
MedIm
18
26
0
13 Nov 2023
Multi-resolution Time-Series Transformer for Long-term Forecasting
Yitian Zhang
Liheng Ma
Soumyasundar Pal
Yingxue Zhang
Mark J. Coates
AI4TS
31
27
0
07 Nov 2023
Navigating Scaling Laws: Compute Optimality in Adaptive Model Training
Sotiris Anagnostidis
Gregor Bachmann
Imanol Schlag
Thomas Hofmann
33
2
0
06 Nov 2023
GTP-ViT: Efficient Vision Transformers via Graph-based Token Propagation
Xuwei Xu
Sen Wang
Yudong Chen
Yanping Zheng
Zhewei Wei
Jiajun Liu
ViT
24
8
0
06 Nov 2023
AiluRus: A Scalable ViT Framework for Dense Prediction
Jin Li
Yaoming Wang
Xiaopeng Zhang
Bowen Shi
Dongsheng Jiang
Chenglin Li
Wenrui Dai
Hongkai Xiong
Qi Tian
57
5
0
02 Nov 2023
TESTA: Temporal-Spatial Token Aggregation for Long-form Video-Language Understanding
Shuhuai Ren
Sishuo Chen
Shicheng Li
Xu Sun
Lu Hou
ViT
43
28
0
29 Oct 2023
PELA: Learning Parameter-Efficient Models with Low-Rank Approximation
Yangyang Guo
Guangzhi Wang
Mohan S. Kankanhalli
21
3
0
16 Oct 2023
Reusing Pretrained Models by Multi-linear Operators for Efficient Training
Yu Pan
Ye Yuan
Yichun Yin
Zenglin Xu
Lifeng Shang
Xin Jiang
Qun Liu
44
16
0
16 Oct 2023
QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models
Jing Liu
Ruihao Gong
Xiuying Wei
Zhiwei Dong
Jianfei Cai
Bohan Zhuang
MQ
23
51
0
12 Oct 2023
Accelerating Vision Transformers Based on Heterogeneous Attention Patterns
Deli Yu
Teng Xi
Jianwei Li
Baopu Li
Gang Zhang
Haocheng Feng
Junyu Han
Jingtuo Liu
Errui Ding
Jingdong Wang
ViT
31
0
0
11 Oct 2023