ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2410.05643
  4. Cited By
TRACE: Temporal Grounding Video LLM via Causal Event Modeling

TRACE: Temporal Grounding Video LLM via Causal Event Modeling

8 October 2024
Yongxin Guo
Jingyu Liu
Mingda Li
Xiaoying Tang
Qingbin Liu
Xiaoying Tang
ArXivPDFHTML

Papers citing "TRACE: Temporal Grounding Video LLM via Causal Event Modeling"

10 / 10 papers shown
Title
TEMPURA: Temporal Event Masked Prediction and Understanding for Reasoning in Action
TEMPURA: Temporal Event Masked Prediction and Understanding for Reasoning in Action
Jen-Hao Cheng
Vivian Wang
Huayu Wang
Huapeng Zhou
Yi-Hao Peng
...
Wenhao Chai
Yi-Ling Chen
Vibhav Vineet
Qin Cai
Lei Li
AI4TS
172
0
0
02 May 2025
Video-MMLU: A Massive Multi-Discipline Lecture Understanding Benchmark
Video-MMLU: A Massive Multi-Discipline Lecture Understanding Benchmark
Enxin Song
Wenhao Chai
Weili Xu
Jianwen Xie
Yuxuan Liu
Gaoang Wang
62
0
0
20 Apr 2025
Caption Anything in Video: Fine-grained Object-centric Captioning via Spatiotemporal Multimodal Prompting
Caption Anything in Video: Fine-grained Object-centric Captioning via Spatiotemporal Multimodal Prompting
Yunlong Tang
Jing Bi
Chao Huang
Susan Liang
Daiki Shimada
...
Jinxi He
Liu He
Zeliang Zhang
Jiebo Luo
Chenliang Xu
37
0
0
07 Apr 2025
Shot-by-Shot: Film-Grammar-Aware Training-Free Audio Description Generation
Shot-by-Shot: Film-Grammar-Aware Training-Free Audio Description Generation
Junyu Xie
Tengda Han
Max Bain
Arsha Nagrani
Eshika Khandelwal
Gül Varol
Weidi Xie
Andrew Zisserman
DiffM
VGen
59
0
0
01 Apr 2025
V-STaR: Benchmarking Video-LLMs on Video Spatio-Temporal Reasoning
Zixu Cheng
Jian Hu
Ziquan Liu
Chenyang Si
Wei Li
Shaogang Gong
LRM
78
2
0
14 Mar 2025
Measure Twice, Cut Once: Grasping Video Structures and Event Semantics with LLMs for Video Temporal Localization
Zongshang Pang
Mayu Otani
Yuta Nakashima
58
0
0
12 Mar 2025
VidCapBench: A Comprehensive Benchmark of Video Captioning for Controllable Text-to-Video Generation
VidCapBench: A Comprehensive Benchmark of Video Captioning for Controllable Text-to-Video Generation
Xinlong Chen
Yang Zhang
Chongling Rao
Yushuo Guan
Jiaheng Liu
Fuzheng Zhang
Chengru Song
Qiang Liu
Di Zhang
Tieniu Tan
17
0
0
18 Feb 2025
Do Language Models Understand Time?
Do Language Models Understand Time?
Xi Ding
Lei Wang
181
0
0
18 Dec 2024
TimeRefine: Temporal Grounding with Time Refining Video LLM
TimeRefine: Temporal Grounding with Time Refining Video LLM
Xizi Wang
Feng Cheng
Ziyang Wang
Huiyu Wang
Md. Mohaiminul Islam
Lorenzo Torresani
Joey Tianyi Zhou
Gedas Bertasius
David J. Crandall
109
1
0
12 Dec 2024
Explainable and Interpretable Multimodal Large Language Models: A
  Comprehensive Survey
Explainable and Interpretable Multimodal Large Language Models: A Comprehensive Survey
Yunkai Dang
Kaichen Huang
Jiahao Huo
Yibo Yan
S. Huang
...
Kun Wang
Yong Liu
Jing Shao
Hui Xiong
Xuming Hu
LRM
104
15
0
03 Dec 2024
1