ResearchTrend.AI
Coarse Correspondence Elicit 3D Spacetime Understanding in Multimodal Language Model

arXiv: 2408.00754

1 August 2024
Benlin Liu
Yuhao Dong
Yiqin Wang
Yongming Rao
Yansong Tang
Wei-Chiu Ma
Ranjay Krishna
    LRM

Papers citing "Coarse Correspondence Elicit 3D Spacetime Understanding in Multimodal Language Model"

2 / 2 papers shown
VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding
Yue Fan
Xiaojian Ma
Rujie Wu
Yuntao Du
Jiaqi Li
Zhi Gao
Qing Li
VLM
LLMAG
51
59
0
18 Mar 2024
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
322
4,300
0
30 Jan 2023