2504.01328
Cited By
Slow-Fast Architecture for Video Multi-Modal Large Language Models
2 April 2025
Min Shi
Shihao Wang
Chieh-Yun Chen
Jitesh Jain
Kai Wang
Junjun Xiong
Guilin Liu
Zhiding Yu
Humphrey Shi
Papers citing
"Slow-Fast Architecture for Video Multi-Modal Large Language Models"
4 / 4 papers shown
Title
Towards Low-Latency Event Stream-based Visual Object Tracking: A Slow-Fast Approach
Shiao Wang
Xiao Wang
Liye Jin
Bo Jiang
Lin Zhu
Lan Chen
Yonghong Tian
Bin Luo
73
0
0
19 May 2025
Mavors: Multi-granularity Video Representation for Multimodal Large Language Model
Yang Shi
Jiaheng Liu
Yushuo Guan
Zhikai Wu
Yize Zhang
...
Bohan Zeng
Wei Zhang
Fuzheng Zhang
Wenjing Yang
Di Zhang
VGen
VLM
136
2
0
14 Apr 2025
AuroraCap: Efficient, Performant Video Detailed Captioning and a New Benchmark
Wenhao Chai
Enxin Song
Y. Du
Chenlin Meng
Vashisht Madhavan
Omer Bar-Tal
Jeng-Neng Hwang
Saining Xie
Christopher D. Manning
3DV
219
37
0
04 Oct 2024
Oryx MLLM: On-Demand Spatial-Temporal Understanding at Arbitrary Resolution
Zuyan Liu
Yuhao Dong
Ziwei Liu
Winston Hu
Jiwen Lu
Yongming Rao
ObjD
214
72
0
19 Sep 2024