2408.00754
Coarse Correspondence Elicit 3D Spacetime Understanding in Multimodal Language Model
1 August 2024
Benlin Liu
Yuhao Dong
Yiqin Wang
Yongming Rao
Yansong Tang
Wei-Chiu Ma
Ranjay Krishna
LRM
Papers citing "Coarse Correspondence Elicit 3D Spacetime Understanding in Multimodal Language Model"
Title
VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding
Yue Fan
Xiaojian Ma
Rujie Wu
Yuntao Du
Jiaqi Li
Zhi Gao
Qing Li
VLM
LLMAG
51
59
0
18 Mar 2024
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
322
4,300
0
30 Jan 2023