ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2311.18651
  4. Cited By
LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding,
  Reasoning, and Planning

LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning

30 November 2023
Sijin Chen
Xin Chen
C. Zhang
Mingsheng Li
Gang Yu
Hao Fei
Erik Cambria
Jiayuan Fan
Tao Chen
    MLLM
ArXiv (abs)PDFHTML

Papers citing "LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning"

24 / 74 papers shown
Title
Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning
  Instruction Using Language Model
Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Model
Wenqi Zhang
Zhenglin Cheng
Yuanyu He
Mengna Wang
Yongliang Shen
...
Guiyang Hou
Mingqian He
Yanna Ma
Weiming Lu
Yueting Zhuang
SyDa
184
13
0
09 Jul 2024
ScanReason: Empowering 3D Visual Grounding with Reasoning Capabilities
ScanReason: Empowering 3D Visual Grounding with Reasoning Capabilities
Chenming Zhu
Tai Wang
Wenwei Zhang
Kai Chen
Xihui Liu
ReLMLRM
112
24
0
01 Jul 2024
LiveScene: Language Embedding Interactive Radiance Fields for Physical Scene Rendering and Control
LiveScene: Language Embedding Interactive Radiance Fields for Physical Scene Rendering and Control
Delin Qu
Qizhi Chen
Pingrui Zhang
Xianqiang Gao
Bin Zhao
Bin Zhao
Dong Wang
Xuelong Li
AI4CE
119
8
0
23 Jun 2024
Lightweight Model Pre-training via Language Guided Knowledge
  Distillation
Lightweight Model Pre-training via Language Guided Knowledge Distillation
Mingsheng Li
Lin Zhang
Mingzhen Zhu
Zilong Huang
Gang Yu
Jiayuan Fan
Tao Chen
85
2
0
17 Jun 2024
MMScan: A Multi-Modal 3D Scene Dataset with Hierarchical Grounded Language Annotations
MMScan: A Multi-Modal 3D Scene Dataset with Hierarchical Grounded Language Annotations
Ruiyuan Lyu
Tai Wang
Jingli Lin
Shuai Yang
Xiaohan Mao
...
Runsen Xu
Haifeng Huang
Chenming Zhu
Dahua Lin
Jiangmiao Pang
3DV
103
18
0
13 Jun 2024
3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination
3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination
Jianing Yang
Xuweiyi Chen
Nikhil Madaan
Madhavan Iyengar
Shengyi Qian
David Fouhey
Joyce Chai
3DV
162
16
0
07 Jun 2024
LLplace: The 3D Indoor Scene Layout Generation and Editing via Large
  Language Model
LLplace: The 3D Indoor Scene Layout Generation and Editing via Large Language Model
Yixuan Yang
Junru Lu
Zixiang Zhao
Zhen Luo
James J.Q. Yu
Victor Sanchez
Feng Zheng
3DV
100
7
0
06 Jun 2024
Collaborative Novel Object Discovery and Box-Guided Cross-Modal
  Alignment for Open-Vocabulary 3D Object Detection
Collaborative Novel Object Discovery and Box-Guided Cross-Modal Alignment for Open-Vocabulary 3D Object Detection
Yang Cao
Yihan Zeng
Hang Xu
Dan Xu
3DPCObjD
80
6
0
02 Jun 2024
MeshXL: Neural Coordinate Field for Generative 3D Foundation Models
MeshXL: Neural Coordinate Field for Generative 3D Foundation Models
Sijin Chen
Xin Chen
Anqi Pang
Xianfang Zeng
Wei Cheng
...
C. Zhang
Jingyi Yu
Gang Yu
Bin-Bin Fu
Tao Chen
AI4CE
133
43
0
31 May 2024
Reasoning3D -- Grounding and Reasoning in 3D: Fine-Grained Zero-Shot
  Open-Vocabulary 3D Reasoning Part Segmentation via Large Vision-Language
  Models
Reasoning3D -- Grounding and Reasoning in 3D: Fine-Grained Zero-Shot Open-Vocabulary 3D Reasoning Part Segmentation via Large Vision-Language Models
Tianrun Chen
Chunan Yu
Jing Li
Jianqi Zhang
Lanyun Zhu
Deyi Ji
Yong Zhang
Ying Zang
Zejian Li
Lingyun Sun
LRM
107
9
0
29 May 2024
Reason3D: Searching and Reasoning 3D Segmentation via Large Language Model
Reason3D: Searching and Reasoning 3D Segmentation via Large Language Model
Kuan-Chih Huang
Xiangtai Li
Lu Qi
Shuicheng Yan
Ming-Hsuan Yang
LRM
175
12
0
27 May 2024
Unifying 3D Vision-Language Understanding via Promptable Queries
Unifying 3D Vision-Language Understanding via Promptable Queries
Ziyu Zhu
Zhuofan Zhang
Xiaojian Ma
Xuesong Niu
Yixin Chen
Baoxiong Jia
Zhidong Deng
Siyuan Huang
Qing Li
110
32
0
19 May 2024
Grounded 3D-LLM with Referent Tokens
Grounded 3D-LLM with Referent Tokens
Yilun Chen
Shuai Yang
Haifeng Huang
Tai Wang
Ruiyuan Lyu
Runsen Xu
Dahua Lin
Jiangmiao Pang
108
37
0
16 May 2024
When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks
  via Multi-modal Large Language Models
When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models
Xianzheng Ma
Yash Bhalgat
Brandon Smart
Shuai Chen
Xinghui Li
...
Matthias Nießner
Ian D Reid
Angel X. Chang
Iro Laina
V. Prisacariu
LRM
132
21
0
16 May 2024
MiniGPT-3D: Efficiently Aligning 3D Point Clouds with Large Language
  Models using 2D Priors
MiniGPT-3D: Efficiently Aligning 3D Point Clouds with Large Language Models using 2D Priors
Yuan Tang
Xu Han
Xianzhi Li
Qiao Yu
Yixue Hao
Long Hu
Min Chen
88
20
0
02 May 2024
PARIS3D: Reasoning-based 3D Part Segmentation Using Large Multimodal
  Model
PARIS3D: Reasoning-based 3D Part Segmentation Using Large Multimodal Model
Amrin Kareem
Jean Lahoud
Hisham Cholakkal
LRM
92
4
0
04 Apr 2024
TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes
TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes
Bu Jin
Yupeng Zheng
Pengfei Li
Weize Li
Yuhang Zheng
...
Kun Zhan
Peng Jia
Xiaoxiao Long
Yilun Chen
Hao Zhao
3DV
114
20
0
28 Mar 2024
PointCloud-Text Matching: Benchmark Datasets and a Baseline
PointCloud-Text Matching: Benchmark Datasets and a Baseline
Yanglin Feng
Yang Qin
Dezhong Peng
Erik Cambria
Xi Peng
Peng Hu
90
1
0
28 Mar 2024
DreamFrame: Enhancing Video Understanding via Automatically Generated QA and Style-Consistent Keyframes
DreamFrame: Enhancing Video Understanding via Automatically Generated QA and Style-Consistent Keyframes
Zhende Song
Chenchen Wang
Jiamu Sheng
C. Zhang
Gang Yu
Jiayuan Fan
Tao Chen
VGen
89
20
0
03 Mar 2024
The Revolution of Multimodal Large Language Models: A Survey
The Revolution of Multimodal Large Language Models: A Survey
Davide Caffagni
Federico Cocchi
Luca Barsellotti
Nicholas Moratelli
Sara Sarto
Lorenzo Baraldi
Lorenzo Baraldi
Marcella Cornia
Rita Cucchiara
LRMVLM
137
64
0
19 Feb 2024
An Embodied Generalist Agent in 3D World
An Embodied Generalist Agent in 3D World
Jiangyong Huang
Silong Yong
Xiaojian Ma
Xiongkun Linghu
Puhao Li
Yan Wang
Qing Li
Song-Chun Zhu
Baoxiong Jia
Siyuan Huang
LM&Ro
118
176
0
18 Nov 2023
GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest
GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest
Shilong Zhang
Pei Sun
Shoufa Chen
Min Xiao
Wenqi Shao
Wenwei Zhang
Yu Liu
Kai-xiang Chen
Ping Luo
MLLMVLM
168
238
0
07 Jul 2023
A Survey on Multimodal Large Language Models
A Survey on Multimodal Large Language Models
Shukang Yin
Chaoyou Fu
Sirui Zhao
Ke Li
Xing Sun
Tong Xu
Enhong Chen
MLLMLRM
140
613
0
23 Jun 2023
A Survey of Label-Efficient Deep Learning for 3D Point Clouds
A Survey of Label-Efficient Deep Learning for 3D Point Clouds
Aoran Xiao
Xiaoqin Zhang
Ling Shao
Shijian Lu
3DPC
109
24
0
31 May 2023
Previous
12