2409.18938
Cited By
From Seconds to Hours: Reviewing MultiModal Large Language Models on Comprehensive Long Video Understanding
27 September 2024
Heqing Zou
Tianze Luo
Guiyang Xie
Victor Zhang
Fengmao Lv
Guangcong Wang
Juanyang Chen
Zhuochen Wang
Hansheng Zhang
Huaijian Zhang
VLM
Papers citing
"From Seconds to Hours: Reviewing MultiModal Large Language Models on Comprehensive Long Video Understanding"
3 / 3 papers shown
Title
DyGEnc: Encoding a Sequence of Textual Scene Graphs to Reason and Answer Questions in Dynamic Scenes
S. Linok
Vadim Semenov
Anastasia Trunova
Oleg Bulichev
Dmitry A. Yudin
52
0
0
06 May 2025
Logic-in-Frames: Dynamic Keyframe Search via Visual Semantic-Logical Verification for Long Video Understanding
Weiyu Guo
Ziyang Chen
Shaoguang Wang
Jianxiang He
Yijie Xu
Jinhui Ye
Ying Sun
Hui Xiong
49
1
0
17 Mar 2025
ReTaKe: Reducing Temporal and Knowledge Redundancy for Long Video Understanding
Xiao Wang
Qingyi Si
Jianlong Wu
Shiyu Zhu
Zheng Lin
Liqiang Nie
VLM
85
6
0
29 Dec 2024