2502.11494
Cited By
Stop Looking for Important Tokens in Multimodal Language Models: Duplication Matters More
17 February 2025
Zichen Wen
Yifeng Gao
Shaobo Wang
J.N. Zhang
Qintong Zhang
Weijia Li
Conghui He
Linfeng Zhang
VLM
Papers citing "Stop Looking for Important Tokens in Multimodal Language Models: Duplication Matters More"
7 / 7 papers shown
VScan: Rethinking Visual Token Reduction for Efficient Large Vision-Language Models
Ce Zhang
Kaixin Ma
Tianqing Fang
Wenhao Yu
Hongming Zhang
Zhisong Zhang
Yaqi Xie
Katia Sycara
Haitao Mi
Dong Yu
VLM
24
0
0
28 May 2025
ToDRE: Visual Token Pruning via Diversity and Task Awareness for Efficient Large Vision-Language Models
Duo Li
Zuhao Yang
Shijian Lu
VLM
35
0
0
24 May 2025
QuickVideo: Real-Time Long Video Understanding with System Algorithm Co-Design
Benjamin Schneider
Dongfu Jiang
Chao Du
Tianyu Pang
Wenhu Chen
VLM
20
0
0
22 May 2025
FLASH: Latent-Aware Semi-Autoregressive Speculative Decoding for Multimodal Tasks
Zihua Wang
Ruibo Li
Haozhe Du
Joey Tianyi Zhou
Yu Zhang
Xu Yang
MLLM
61
0
0
19 May 2025
TimeChat-Online: 80% Visual Tokens are Naturally Redundant in Streaming Videos
Linli Yao
You Li
Y. X. Wei
Lei Li
Shuhuai Ren
...
Sida Li
Dianbo Sui
Qi Liu
Yanzhe Zhang
Xu Sun
71
1
0
24 Apr 2025
LEO-MINI: An Efficient Multimodal Large Language Model using Conditional Token Reduction and Mixture of Multi-Modal Experts
Yimu Wang
Mozhgan Nasr Azadani
Sean Sedwards
Krzysztof Czarnecki
MLLM
MoE
72
0
0
07 Apr 2025
VideoScan: Enabling Efficient Streaming Video Understanding via Frame-level Semantic Carriers
Ruanjun Li
Yuedong Tan
Yuanming Shi
Jiawei Shao
VLM
295
0
0
12 Mar 2025