ST$^3$: Accelerating Multimodal Large Language Model by Spatial-Temporal Visual Token Trimming
arXiv:2412.20105
31 December 2024
Jiedong Zhuang
Lu Lu
Ming Dai
Rui Hu
Jingshu Chen
Qiang Liu
Haoji Hu
Papers citing "ST$^3$: Accelerating Multimodal Large Language Model by Spatial-Temporal Visual Token Trimming"
2 / 2 papers shown
Title
Video-MMLU: A Massive Multi-Discipline Lecture Understanding Benchmark
Enxin Song
Wenhao Chai
Weili Xu
Jianwen Xie
Yuxuan Liu
Gaoang Wang
120
6
0
20 Apr 2025
Multi-task Visual Grounding with Coarse-to-Fine Consistency Constraints
Ming Dai
Jian Li
Jiedong Zhuang
Xian Zhang
Wankou Yang
ObjD
80
2
0
12 Jan 2025