ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2401.03201
  4. Cited By
3DMIT: 3D Multi-modal Instruction Tuning for Scene Understanding

3DMIT: 3D Multi-modal Instruction Tuning for Scene Understanding

6 January 2024
Zeju Li
Chao Zhang
Xiaoyan Wang
Ruilong Ren
Yifan Xu
Ruifei Ma
Xiangde Liu
    MLLM
ArXivPDFHTML

Papers citing "3DMIT: 3D Multi-modal Instruction Tuning for Scene Understanding"

15 / 15 papers shown
Title
The Point, the Vision and the Text: Does Point Cloud Boost Spatial Reasoning of Large Language Models?
The Point, the Vision and the Text: Does Point Cloud Boost Spatial Reasoning of Large Language Models?
Weichen Zhang
Ruiying Peng
Chen Gao
Jianjie Fang
Xin Zeng
...
Z. Wang
Jinqiang Cui
Xin Wang
Xinlei Chen
Y. Li
LRM
71
0
0
06 Apr 2025
Out-of-Distribution Radar Detection in Compound Clutter and Thermal Noise through Variational Autoencoders
Y A Rouzoumka
E Terreaux
C. Morisseau
J. Ovarlez
C. Ren
51
2
0
06 Mar 2025
Inst3D-LMM: Instance-Aware 3D Scene Understanding with Multi-modal Instruction Tuning
Hanxun Yu
Wentong Li
Song Wang
J. Chen
Jianke Zhu
3DV
LRM
77
3
0
01 Mar 2025
Omni-SILA: Towards Omni-scene Driven Visual Sentiment Identifying, Locating and Attributing in Videos
Jiamin Luo
Jingjing Wang
Junxiao Ma
Yujie Jin
Shoushan Li
Guodong Zhou
31
0
0
26 Feb 2025
PrefixKV: Adaptive Prefix KV Cache is What Vision Instruction-Following
  Models Need for Efficient Generation
PrefixKV: Adaptive Prefix KV Cache is What Vision Instruction-Following Models Need for Efficient Generation
Ao Wang
Hui Chen
Jianchao Tan
K. Zhang
Xunliang Cai
Zijia Lin
J. Han
Guiguang Ding
VLM
77
3
0
04 Dec 2024
SYNERGAI: Perception Alignment for Human-Robot Collaboration
SYNERGAI: Perception Alignment for Human-Robot Collaboration
Yixin Chen
Guoxi Zhang
Yaowei Zhang
Hongming Xu
Peiyuan Zhi
Qing Li
Siyuan Huang
32
0
0
24 Sep 2024
Visual Prompting in Multimodal Large Language Models: A Survey
Visual Prompting in Multimodal Large Language Models: A Survey
Junda Wu
Zhehao Zhang
Yu Xia
Xintong Li
Zhaoyang Xia
...
Subrata Mitra
Dimitris N. Metaxas
Lina Yao
Jingbo Shang
Julian McAuley
VLM
LRM
53
12
0
05 Sep 2024
A Comprehensive Review of Multimodal Large Language Models: Performance
  and Challenges Across Different Tasks
A Comprehensive Review of Multimodal Large Language Models: Performance and Challenges Across Different Tasks
Jiaqi Wang
Hanqi Jiang
Yi-Hsueh Liu
Chong Ma
Xu-Yao Zhang
...
Xin Zhang
Wei Zhang
Dinggang Shen
Tianming Liu
Shu Zhang
VLM
AI4TS
46
30
0
02 Aug 2024
The Synergy between Data and Multi-Modal Large Language Models: A Survey
  from Co-Development Perspective
The Synergy between Data and Multi-Modal Large Language Models: A Survey from Co-Development Perspective
Zhen Qin
Daoyuan Chen
Wenhao Zhang
Liuyi Yao
Yilun Huang
Bolin Ding
Yaliang Li
Shuiguang Deng
57
5
0
11 Jul 2024
A Survey on Text-guided 3D Visual Grounding: Elements, Recent Advances,
  and Future Directions
A Survey on Text-guided 3D Visual Grounding: Elements, Recent Advances, and Future Directions
Daizong Liu
Yang Liu
Wencan Huang
Wei Hu
LM&Ro
35
9
0
09 Jun 2024
3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination
3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination
Jianing Yang
Xuweiyi Chen
Nikhil Madaan
Madhavan Iyengar
Shengyi Qian
David Fouhey
Joyce Chai
3DV
73
11
0
07 Jun 2024
When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks
  via Multi-modal Large Language Models
When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models
Xianzheng Ma
Yash Bhalgat
Brandon Smart
Shuai Chen
Xinghui Li
...
Matthias Nießner
Ian D Reid
Angel X. Chang
Iro Laina
V. Prisacariu
LRM
33
12
0
16 May 2024
Agent3D-Zero: An Agent for Zero-shot 3D Understanding
Agent3D-Zero: An Agent for Zero-shot 3D Understanding
Sha Zhang
Di Huang
Jiajun Deng
Shixiang Tang
Wanli Ouyang
Tong He
Yanyong Zhang
VGen
38
13
0
18 Mar 2024
MM-LLMs: Recent Advances in MultiModal Large Language Models
MM-LLMs: Recent Advances in MultiModal Large Language Models
Duzhen Zhang
Yahan Yu
Jiahua Dong
Chenxing Li
Dan Su
Chenhui Chu
Dong Yu
OffRL
LRM
50
178
0
24 Jan 2024
AffordanceLLM: Grounding Affordance from Vision Language Models
AffordanceLLM: Grounding Affordance from Vision Language Models
Shengyi Qian
Weifeng Chen
Min Bai
Xiong Zhou
Zhuowen Tu
Li Erran Li
17
20
0
12 Jan 2024
1