ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2211.16312
  4. Cited By
PLA: Language-Driven Open-Vocabulary 3D Scene Understanding
v1v2 (latest)

PLA: Language-Driven Open-Vocabulary 3D Scene Understanding

29 November 2022
Runyu Ding
Jihan Yang
Chuhui Xue
Wenqing Zhang
Song Bai
Xiaojuan Qi
    VLM
ArXiv (abs)PDFHTML

Papers citing "PLA: Language-Driven Open-Vocabulary 3D Scene Understanding"

50 / 122 papers shown
Title
MeshSegmenter: Zero-Shot Mesh Semantic Segmentation via Texture
  Synthesis
MeshSegmenter: Zero-Shot Mesh Semantic Segmentation via Texture Synthesis
Ziming Zhong
Yanxu Xu
Jing Li
Jiale Xu
Zhengxin Li
Chaohui Yu
Shenghua Gao
3DV
120
3
0
18 Jul 2024
Open-Vocabulary 3D Semantic Segmentation with Text-to-Image Diffusion
  Models
Open-Vocabulary 3D Semantic Segmentation with Text-to-Image Diffusion Models
Xiaoyu Zhu
Hao Zhou
Pengfei Xing
Long Zhao
Hao Xu
Junwei Liang
Alex Hauptmann
Ting Liu
Andrew C. Gallagher
DiffM
123
4
0
18 Jul 2024
Open Vocabulary 3D Scene Understanding via Geometry Guided
  Self-Distillation
Open Vocabulary 3D Scene Understanding via Geometry Guided Self-Distillation
Pengfei Wang
Yuxi Wang
Shuai Li
Zhaoxiang Zhang
Zhen Lei
Lei Zhang
113
3
0
18 Jul 2024
Dense Multimodal Alignment for Open-Vocabulary 3D Scene Understanding
Dense Multimodal Alignment for Open-Vocabulary 3D Scene Understanding
Ruihuang Li
Zhengqiang Zhang
Chenhang He
Zhiyuan Ma
Vishal M. Patel
Lei Zhang
3DVVLM
98
6
0
13 Jul 2024
Unlocking Textual and Visual Wisdom: Open-Vocabulary 3D Object Detection
  Enhanced by Comprehensive Guidance from Text and Image
Unlocking Textual and Visual Wisdom: Open-Vocabulary 3D Object Detection Enhanced by Comprehensive Guidance from Text and Image
Pengkun Jiao
Na Zhao
Jingjing Chen
Yu-Gang Jiang
VLMObjD
72
3
0
07 Jul 2024
3D Feature Distillation with Object-Centric Priors
3D Feature Distillation with Object-Centric Priors
Georgios Tziafas
Yucheng Xu
Zhibin Li
Hamidreza Kasaei
94
1
0
26 Jun 2024
Situational Awareness Matters in 3D Vision Language Reasoning
Situational Awareness Matters in 3D Vision Language Reasoning
Yunze Man
Liang-Yan Gui
Yu-Xiong Wang
91
18
0
11 Jun 2024
OpenGaussian: Towards Point-Level 3D Gaussian-based Open Vocabulary
  Understanding
OpenGaussian: Towards Point-Level 3D Gaussian-based Open Vocabulary Understanding
Y. Wu
Jiarui Meng
Haijie Li
Chenming Wu
Yahao Shi
...
Chen Zhao
Haocheng Feng
Errui Ding
Jingdong Wang
Jian Zhang
3DGS3DPC
96
35
0
04 Jun 2024
Collaborative Novel Object Discovery and Box-Guided Cross-Modal
  Alignment for Open-Vocabulary 3D Object Detection
Collaborative Novel Object Discovery and Box-Guided Cross-Modal Alignment for Open-Vocabulary 3D Object Detection
Yang Cao
Yihan Zeng
Hang Xu
Dan Xu
3DPCObjD
80
6
0
02 Jun 2024
Open-Vocabulary SAM3D: Understand Any 3D Scene
Open-Vocabulary SAM3D: Understand Any 3D Scene
Hanchen Tai
Qingdong He
Jiangning Zhang
Yijie Qian
Zhenyu Zhang
Xiaobin Hu
Yabiao Wang
Yong Liu
VLM
124
0
0
24 May 2024
Unifying 3D Vision-Language Understanding via Promptable Queries
Unifying 3D Vision-Language Understanding via Promptable Queries
Ziyu Zhu
Zhuofan Zhang
Xiaojian Ma
Xuesong Niu
Yixin Chen
Baoxiong Jia
Zhidong Deng
Siyuan Huang
Qing Li
118
32
0
19 May 2024
Grounded 3D-LLM with Referent Tokens
Grounded 3D-LLM with Referent Tokens
Yilun Chen
Shuai Yang
Haifeng Huang
Tai Wang
Ruiyuan Lyu
Runsen Xu
Dahua Lin
Jiangmiao Pang
115
37
0
16 May 2024
When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks
  via Multi-modal Large Language Models
When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models
Xianzheng Ma
Yash Bhalgat
Brandon Smart
Shuai Chen
Xinghui Li
...
Matthias Nießner
Ian D Reid
Angel X. Chang
Iro Laina
V. Prisacariu
LRM
132
21
0
16 May 2024
Probing Multimodal LLMs as World Models for Driving
Probing Multimodal LLMs as World Models for Driving
Shiva Sreeram
Tsun-Hsuan Wang
Alaa Maalouf
Guy Rosman
S. Karaman
Daniela Rus
91
10
0
09 May 2024
Tri-modal Confluence with Temporal Dynamics for Scene Graph Generation
  in Operating Rooms
Tri-modal Confluence with Temporal Dynamics for Scene Graph Generation in Operating Rooms
Diandian Guo
Manxi Lin
Jialun Pei
He Tang
Yueming Jin
Pheng-Ann Heng
72
2
0
14 Apr 2024
PARIS3D: Reasoning-based 3D Part Segmentation Using Large Multimodal
  Model
PARIS3D: Reasoning-based 3D Part Segmentation Using Large Multimodal Model
Amrin Kareem
Jean Lahoud
Hisham Cholakkal
LRM
92
4
0
04 Apr 2024
Segment Any 3D Object with Language
Segment Any 3D Object with Language
Seungjun Lee
Yuyang Zhao
Gim Hee Lee
79
1
0
02 Apr 2024
GOV-NeSF: Generalizable Open-Vocabulary Neural Semantic Fields
GOV-NeSF: Generalizable Open-Vocabulary Neural Semantic Fields
Yunsong Wang
Hanlin Chen
Gim Hee Lee
124
6
0
01 Apr 2024
OV-Uni3DETR: Towards Unified Open-Vocabulary 3D Object Detection via
  Cycle-Modality Propagation
OV-Uni3DETR: Towards Unified Open-Vocabulary 3D Object Detection via Cycle-Modality Propagation
Zhenyu Wang
Yali Li
Taichi Liu
Hengshuang Zhao
Shengjin Wang
3DPCObjD
102
8
0
28 Mar 2024
Semantic Gaussians: Open-Vocabulary Scene Understanding with 3D Gaussian
  Splatting
Semantic Gaussians: Open-Vocabulary Scene Understanding with 3D Gaussian Splatting
Jun Guo
Xiaojian Ma
Yue Fan
Huaping Liu
Qing Li
3DGS
116
31
0
22 Mar 2024
Can 3D Vision-Language Models Truly Understand Natural Language?
Can 3D Vision-Language Models Truly Understand Natural Language?
Weipeng Deng
Jihan Yang
Runyu Ding
Jiahui Liu
Yijiang Li
Xiaojuan Qi
Edith C.H. Ngai
123
6
0
21 Mar 2024
N2F2: Hierarchical Scene Understanding with Nested Neural Feature Fields
N2F2: Hierarchical Scene Understanding with Nested Neural Feature Fields
Yash Bhalgat
Iro Laina
João F. Henriques
Andrew Zisserman
Andrea Vedaldi
97
17
0
16 Mar 2024
PointSeg: A Training-Free Paradigm for 3D Scene Segmentation via
  Foundation Models
PointSeg: A Training-Free Paradigm for 3D Scene Segmentation via Foundation Models
Qingdong He
Jinlong Peng
Zhengkai Jiang
Xiaobin Hu
Jiangning Zhang
Qiang Nie
Yabiao Wang
Chengjie Wang
3DPCVLM
95
5
0
11 Mar 2024
Self-Adapting Large Visual-Language Models to Edge Devices across Visual
  Modalities
Self-Adapting Large Visual-Language Models to Edge Devices across Visual Modalities
Kaiwen Cai
Zhekai Duan
Gaowen Liu
Charles Fleming
Chris Xiaoxuan Lu
VLM
89
4
0
07 Mar 2024
ShapeLLM: Universal 3D Object Understanding for Embodied Interaction
ShapeLLM: Universal 3D Object Understanding for Embodied Interaction
Zekun Qi
Runpei Dong
Shaochen Zhang
Haoran Geng
Chunrui Han
Zheng Ge
Li Yi
Kaisheng Ma
206
63
0
27 Feb 2024
OpenSUN3D: 1st Workshop Challenge on Open-Vocabulary 3D Scene
  Understanding
OpenSUN3D: 1st Workshop Challenge on Open-Vocabulary 3D Scene Understanding
Francis Engelmann
Ayca Takmaz
Jonas Schult
Elisabetta Fedele
Johanna Wald
...
Xiaoyang Wu
Xi Chen
Hengshuang Zhao
Lei Zhu
Joan Lasenby
94
3
0
23 Feb 2024
V-IRL: Grounding Virtual Intelligence in Real Life
V-IRL: Grounding Virtual Intelligence in Real Life
Jihan Yang
Runyu Ding
Ellis L Brown
Xiaojuan Qi
Saining Xie
LM&Ro
121
22
0
05 Feb 2024
CLIP Can Understand Depth
CLIP Can Understand Depth
Dunam Kim
Seokju Lee
VLMMDE
121
2
0
05 Feb 2024
UniM-OV3D: Uni-Modality Open-Vocabulary 3D Scene Understanding with
  Fine-Grained Feature Representation
UniM-OV3D: Uni-Modality Open-Vocabulary 3D Scene Understanding with Fine-Grained Feature Representation
Qingdong He
Jinlong Peng
Zhengkai Jiang
Kai Wu
Xiaozhong Ji
Jiangning Zhang
Yabiao Wang
Chengjie Wang
Mingang Chen
Yunsheng Wu
3DPC
62
8
0
21 Jan 2024
ODIN: A Single Model for 2D and 3D Segmentation
ODIN: A Single Model for 2D and 3D Segmentation
Ayush Jain
Pushkal Katara
N. Gkanatsios
Adam W. Harley
Gabriel H. Sarch
Kriti Aggarwal
Vishrav Chaudhary
Katerina Fragkiadaki
3DPC
133
9
0
04 Jan 2024
3D Open-Vocabulary Panoptic Segmentation with 2D-3D Vision-Language
  Distillation
3D Open-Vocabulary Panoptic Segmentation with 2D-3D Vision-Language Distillation
Zihao Xiao
Longlong Jing
Shangxuan Wu
Alex Zihao Zhu
Jingwei Ji
...
Thomas Funkhouser
Weicheng Kuo
A. Angelova
Yin Zhou
Shiwei Sheng
VLM
124
6
0
04 Jan 2024
Segment3D: Learning Fine-Grained Class-Agnostic 3D Segmentation without
  Manual Labels
Segment3D: Learning Fine-Grained Class-Agnostic 3D Segmentation without Manual Labels
Rui Huang
Songyou Peng
Ayca Takmaz
Federico Tombari
Marc Pollefeys
Shiji Song
Gao Huang
Francis Engelmann
VLM
110
40
0
28 Dec 2023
EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards
  Embodied AI
EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI
Tai Wang
Xiaohan Mao
Chenming Zhu
Runsen Xu
Ruiyuan Lyu
...
Tianfan Xue
Xihui Liu
Cewu Lu
Dahua Lin
Jiangmiao Pang
LM&Ro
120
74
0
26 Dec 2023
Open3DIS: Open-Vocabulary 3D Instance Segmentation with 2D Mask Guidance
Open3DIS: Open-Vocabulary 3D Instance Segmentation with 2D Mask Guidance
P. Nguyen
T.D. Ngo
E. Kalogerakis
Chuang Gan
Anh Tran
Cuong Pham
Khoi Duc Minh Nguyen
ISeg
154
55
0
17 Dec 2023
SAM-guided Graph Cut for 3D Instance Segmentation
SAM-guided Graph Cut for 3D Instance Segmentation
Haoyu Guo
He Zhu
Sida Peng
Yuang Wang
Yujun Shen
Ruizhen Hu
Xiaowei Zhou
3DV
104
18
0
13 Dec 2023
Segment Any 3D Gaussians
Segment Any 3D Gaussians
Jiazhong Cen
Jiemin Fang
Chen Yang
Lingxi Xie
Xiaopeng Zhang
Wei Shen
Qi Tian
3DGS
178
76
0
01 Dec 2023
LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding,
  Reasoning, and Planning
LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning
Sijin Chen
Xin Chen
C. Zhang
Mingsheng Li
Gang Yu
Hao Fei
Erik Cambria
Jiayuan Fan
Tao Chen
MLLM
103
102
0
30 Nov 2023
Language Embedded 3D Gaussians for Open-Vocabulary Scene Understanding
Language Embedded 3D Gaussians for Open-Vocabulary Scene Understanding
Jin-Chuan Shi
Miao Wang
Hao-Bin Duan
Shao-Hua Guan
3DGS
104
96
0
30 Nov 2023
DAE-Net: Deforming Auto-Encoder for fine-grained shape co-segmentation
DAE-Net: Deforming Auto-Encoder for fine-grained shape co-segmentation
Zhiqin Chen
Qimin Chen
Hang Zhou
Hao Zhang
3DPC3DV
103
3
0
22 Nov 2023
OVIR-3D: Open-Vocabulary 3D Instance Retrieval Without Training on 3D
  Data
OVIR-3D: Open-Vocabulary 3D Instance Retrieval Without Training on 3D Data
Shiyang Lu
Haonan Chang
E. Jing
Abdeslam Boularias
Kostas Bekris
93
58
0
06 Nov 2023
Sculpting Holistic 3D Representation in Contrastive Language-Image-3D
  Pre-training
Sculpting Holistic 3D Representation in Contrastive Language-Image-3D Pre-training
Yipeng Gao
Zeyu Wang
Wei-Shi Zheng
Cihang Xie
Yuyin Zhou
3DPC
153
10
0
03 Nov 2023
Drive Anywhere: Generalizable End-to-end Autonomous Driving with
  Multi-modal Foundation Models
Drive Anywhere: Generalizable End-to-end Autonomous Driving with Multi-modal Foundation Models
Tsun-Hsuan Wang
Alaa Maalouf
Wei Xiao
Yutong Ban
Alexander Amini
Guy Rosman
S. Karaman
Daniela Rus
73
46
0
26 Oct 2023
Lang3DSG: Language-based contrastive pre-training for 3D Scene Graph
  prediction
Lang3DSG: Language-based contrastive pre-training for 3D Scene Graph prediction
Sebastian Koch
Pedro Hermosilla
Narunas Vaskevicius
Mirco Colosi
Timo Ropinski
98
11
0
25 Oct 2023
Recent Advances in Multi-modal 3D Scene Understanding: A Comprehensive
  Survey and Evaluation
Recent Advances in Multi-modal 3D Scene Understanding: A Comprehensive Survey and Evaluation
Yinjie Lei
Zixuan Wang
Feng Chen
Guoqing Wang
Peng Wang
Yang Yang
107
12
0
24 Oct 2023
Set-of-Mark Prompting Unleashes Extraordinary Visual Grounding in GPT-4V
Set-of-Mark Prompting Unleashes Extraordinary Visual Grounding in GPT-4V
Jianwei Yang
Hao Zhang
Feng Li
Xueyan Zou
Chun-yue Li
Jianfeng Gao
MLLMVLM
136
189
0
17 Oct 2023
CoDA: Collaborative Novel Box Discovery and Cross-modal Alignment for
  Open-vocabulary 3D Object Detection
CoDA: Collaborative Novel Box Discovery and Cross-modal Alignment for Open-vocabulary 3D Object Detection
Yang Cao
Yihan Zeng
Hang Xu
Dan Xu
3DPCObjD
97
34
0
04 Oct 2023
ConceptGraphs: Open-Vocabulary 3D Scene Graphs for Perception and
  Planning
ConceptGraphs: Open-Vocabulary 3D Scene Graphs for Perception and Planning
Yuanyi Zhong
Alihusein Kuwajerwala
Sacha Morin
Krishna Murthy Jatavallabhula
Bipasha Sen
...
Celso Miguel de Melo
Joshua B. Tenenbaum
Antonio Torralba
Florian Shkurti
Liam Paull
LM&Ro
142
189
0
28 Sep 2023
LLM-Grounder: Open-Vocabulary 3D Visual Grounding with Large Language
  Model as an Agent
LLM-Grounder: Open-Vocabulary 3D Visual Grounding with Large Language Model as an Agent
Jianing Yang
Xuweiyi Chen
Shengyi Qian
Nikhil Madaan
Madhavan Iyengar
David Fouhey
Joyce Chai
LM&RoLLMAG
152
101
0
21 Sep 2023
Object2Scene: Putting Objects in Context for Open-Vocabulary 3D
  Detection
Object2Scene: Putting Objects in Context for Open-Vocabulary 3D Detection
Chenming Zhu
Wenwei Zhang
Tai Wang
Xihui Liu
Kai-xiang Chen
3DPC
90
18
0
18 Sep 2023
Find What You Want: Learning Demand-conditioned Object Attribute Space
  for Demand-driven Navigation
Find What You Want: Learning Demand-conditioned Object Attribute Space for Demand-driven Navigation
Hongchen Wang
Andy Guan Hong Chen
Xiaoqi Li
Mingdong Wu
Hao Dong
69
16
0
15 Sep 2023
Previous
123
Next