PLA: Language-Driven Open-Vocabulary 3D Scene Understanding
arXiv:2211.16312, v2 (latest)

29 November 2022
Runyu Ding
Jihan Yang
Chuhui Xue
Wenqing Zhang
Song Bai
Xiaojuan Qi
    VLM

Papers citing "PLA: Language-Driven Open-Vocabulary 3D Scene Understanding"

50 / 122 papers shown
Title
FreeQ-Graph: Free-form Querying with Semantic Consistent Scene Graph for 3D Scene Understanding
Chenlu Zhan
Gaoang Wang
Hongwei Wang
3DV
33
0
0
16 Jun 2025
OV-MAP : Open-Vocabulary Zero-Shot 3D Instance Segmentation Map for Robots
Juno Kim
Yesol Park
Hye Jung Yoon
Byoung-Tak Zhang
78
0
0
13 Jun 2025
LEO-VL: Towards 3D Vision-Language Generalists via Data Scaling with Efficient Representation
J. Huang
Xiaojian Ma
Xiongkun Linghu
Yue Fan
Junchao He
...
Qing Li
Song-Chun Zhu
Yixin Chen
Baoxiong Jia
Siyuan Huang
86
0
0
11 Jun 2025
LogoSP: Local-global Grouping of Superpoints for Unsupervised Semantic Segmentation of 3D Point Clouds
Zihui Zhang
Weisheng Dai
Hongtao Wen
Bo Yang
3DPC
28
0
0
09 Jun 2025
Hierarchical Question-Answering for Driving Scene Understanding Using Vision-Language Models
Safaa Abdullahi Moallim Mohamud
Minjin Baek
Dong Seog Han
56
0
0
03 Jun 2025
GraphPad: Inference-Time 3D Scene Graph Updates for Embodied Question Answering
Muhammad Qasim Ali
Saeejith Nair
Alexander Wong
Yuchen Cui
Yuhao Chen
41
0
0
01 Jun 2025
RAZER: Robust Accelerated Zero-Shot 3D Open-Vocabulary Panoptic Reconstruction with Spatio-Temporal Aggregation
Naman Patel
Prashanth Krishnamurthy
Farshad Khorrami
84
0
0
21 May 2025
Geofenced Unmanned Aerial Robotic Defender for Deer Detection and Deterrence (GUARD)
Ebasa Temesgen
Mario Jerez
Greta Brown
Graham Wilson
Sree Ganesh Lalitaditya Divakarla
Sarah Boelter
Oscar Nelson
Robert McPherson
Maria Gini
75
0
0
16 May 2025
NVSMask3D: Hard Visual Prompting with Camera Pose Interpolation for 3D Open Vocabulary Instance Segmentation
Junyuan Fang
Zihan Wang
Yanzhe Zhang
Shuzhe Wang
Iaroslav Melekhov
Arno Solin
VLM
90
0
0
20 Apr 2025
UKBOB: One Billion MRI Labeled Masks for Generalizable 3D Medical Image Segmentation
Emmanuelle Bourigault
A. Jamaludin
Abdullah Hamdi
86
1
0
09 Apr 2025
econSG: Efficient and Multi-view Consistent Open-Vocabulary 3D Semantic Gaussians
Can Zhang
G. Lee
3DV
116
0
0
08 Apr 2025
Unveiling the Mist over 3D Vision-Language Understanding: Object-centric Evaluation with Chain-of-Analysis
J. Huang
Baoxiong Jia
Yansen Wang
Ziyu Zhu
Xiongkun Linghu
Qing Li
Song-Chun Zhu
Siyuan Huang
185
5
0
28 Mar 2025
Semantic Consistent Language Gaussian Splatting for Point-Level Open-vocabulary Querying
Hairong Yin
Huangying Zhan
Yi Tian Xu
Raymond A. Yeh
68
0
0
27 Mar 2025
MLLM-For3D: Adapting Multimodal Large Language Model for 3D Reasoning Segmentation
Jiaxin Huang
Runnan Chen
Ziwen Li
Zhengqing Gao
Xiao He
Yandong Guo
Mingming Gong
Tongliang Liu
LRM
115
1
0
23 Mar 2025
SceneSplat: Gaussian Splatting-based Scene Understanding with Vision-Language Pretraining
Yue Li
Qi Ma
Runyi Yang
Huapeng Li
Mengjiao Ma
...
E. Konukoglu
Theo Gevers
Luc Van Gool
Martin R. Oswald
Danda Pani Paudel
3DGS, VLM
235
2
0
23 Mar 2025
Generalized Few-shot 3D Point Cloud Segmentation with Vision-Language Model
Zhaochong An
Guolei Sun
Yun Liu
Runjia Li
Junlin Han
Ender Konukoglu
Serge Belongie
VLM
173
3
0
20 Mar 2025
Cross-Modal and Uncertainty-Aware Agglomeration for Open-Vocabulary 3D Scene Understanding
Jinlong Li
Cristiano Saltori
Fabio Poiesi
N. Sebe
496
2
0
20 Mar 2025
OSMa-Bench: Evaluating Open Semantic Mapping Under Varying Lighting Conditions
Maxim Popov
Regina Kurkova
Mikhail Iumanov
Jaafar Mahmoud
Sergey Kolyubin
99
0
0
13 Mar 2025
SAS: Segment Any 3D Scene with Integrated 2D Priors
Hao Sun
Jiahao Lu
Jiacheng Deng
Hanzhi Chang
Lifan Wu
Yanzhe Liang
Tianzhu Zhang
112
0
0
11 Mar 2025
GaussianGraph: 3D Gaussian-based Scene Graph Generation for Open-world Scene Understanding
Xihan Wang
Dianyi Yang
Yu Gao
Yufeng Yue
Yi Yang
M. Fu
3DGS
88
0
0
06 Mar 2025
Dr. Splat: Directly Referring 3D Gaussian Splatting via Direct Language Embedding Registration
Kim Jun-Seong
GeonU Kim
Kim Yu-Ji
Yu-Chun Wang
Jaesung Choe
Tae-Hyun Oh
3DGS
135
1
0
23 Feb 2025
SoFar: Language-Grounded Orientation Bridges Spatial Reasoning and Object Manipulation
Zekun Qi
Wenyao Zhang
Yufei Ding
Runpei Dong
Xinqiang Yu
...
Xin Jin
Kaisheng Ma
Zhizheng Zhang
He Wang
Li Yi
LM&Ro
213
7
0
18 Feb 2025
AugRefer: Advancing 3D Visual Grounding via Cross-Modal Augmentation and Spatial Relation-based Referring
Xinyi Wang
Na Zhao
Zhiyuan Han
Dan Guo
Xun Yang
88
1
0
17 Jan 2025
PanoSLAM: Panoptic 3D Scene Reconstruction via Gaussian SLAM
Runnan Chen
Zhaoqing Wang
Jiepeng Wang
Yuexin Ma
Mingming Gong
Wenping Wang
Tongliang Liu
3DGS
108
3
0
03 Jan 2025
GPT4Scene: Understand 3D Scenes from Videos with Vision-Language Models
Zhangyang Qi
Zhixiong Zhang
Ye Fang
Jiaqi Wang
Hengshuang Zhao
244
16
0
02 Jan 2025
Occam's LGS: An Efficient Approach for Language Gaussian Splatting
Jiahuan Cheng
Jan-Nico Zaech
Luc Van Gool
Danda Pani Paudel
3DGS
165
0
0
02 Dec 2024
Scene Co-pilot: Procedural Text to Video Generation with Human in the Loop
Zhaofang Qian
Abolfazl Sharifi
Tucker Carroll
Ser-Nam Lim
VGen
145
0
0
26 Nov 2024
ROOT: VLM based System for Indoor Scene Understanding and Beyond
Yonghui Wang
Shi-Yong Chen
Zhenxing Zhou
Siyi Li
Haoran Li
Wengang Zhou
Haoyang Li
VLM
149
3
0
24 Nov 2024
Training an Open-Vocabulary Monocular 3D Object Detection Model without 3D Data
Rui Huang
Henry Zheng
Yan Wang
Zhuofan Xia
Marco Pavone
Gao Huang
3DPC, VLM
139
1
0
23 Nov 2024
3D-Mem: 3D Scene Memory for Embodied Exploration and Reasoning
Yuncong Yang
Han Yang
Jiachen Zhou
Peihao Chen
Hongxin Zhang
Yilun Du
Chuang Gan
154
0
0
23 Nov 2024
XMask3D: Cross-modal Mask Reasoning for Open Vocabulary 3D Semantic Segmentation
Ziyi Wang
Yijiao Wang
Xumin Yu
Jie Zhou
Jiwen Lu
105
0
0
20 Nov 2024
SA3DIP: Segment Any 3D Instance with Potential 3D Priors
Xi Yang
Xu Gu
Xingyilang Yin
Xinbo Gao
103
0
0
06 Nov 2024
ImOV3D: Learning Open-Vocabulary Point Clouds 3D Object Detection from Only 2D Images
Timing Yang
Yuanliang Ju
Li Yi
3DPC
97
4
0
31 Oct 2024
Multimodality Helps Few-shot 3D Point Cloud Semantic Segmentation
Zhaochong An
Guolei Sun
Yun Liu
Runjia Li
Min Wu
Ming-Ming Cheng
Ender Konukoglu
Serge Belongie
163
9
0
29 Oct 2024
Scene Graph Generation with Role-Playing Large Language Models
Guikun Chen
Jin Li
Wenguan Wang
VLM
99
9
0
20 Oct 2024
Flex: End-to-End Text-Instructed Visual Navigation from Foundation Model Features
Makram Chahine
Alex Quach
Alaa Maalouf
Tsun-Hsuan Wang
Daniela Rus
94
0
0
16 Oct 2024
OrionNav: Online Planning for Robot Autonomy with Context-Aware LLM and Open-Vocabulary Semantic Scene Graphs
Venkata Naren Devarakonda
Raktim Gautam Goswami
Ali Umut Kaypak
Naman Patel
Rooholla Khorrambakht
Prashanth Krishnamurthy
Farshad Khorrami
LM&Ro
103
7
0
08 Oct 2024
Multimodal 3D Fusion and In-Situ Learning for Spatially Aware AI
Chengyuan Xu
Radha Kumaran
Noah Stier
Kangyou Yu
Tobias Höllerer
71
1
0
06 Oct 2024
Point2Graph: An End-to-end Point Cloud-based 3D Open-Vocabulary Scene Graph for Robot Navigation
Yifan Xu
Ziming Luo
Qianwei Wang
Vineet Kamat
Carol Menassa
3DV, 3DPC
60
2
0
16 Sep 2024
Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding
Yunze Man
Shuhong Zheng
Zhipeng Bao
M. Hebert
Liang-Yan Gui
Yu-Xiong Wang
151
23
0
05 Sep 2024
EMPOWER: Embodied Multi-role Open-vocabulary Planning with Online Grounding and Execution
F. Argenziano
Michele Brienza
Vincenzo Suriani
Daniele Nardi
D. Bloisi
LM&Ro
122
2
0
30 Aug 2024
Multimodal Foundational Models for Unsupervised 3D General Obstacle Detection
Tamás Matuszka
Peter Hajas
Dávid Szeghy
77
0
0
22 Aug 2024
Positional Prompt Tuning for Efficient 3D Representation Learning
Shaochen Zhang
Zekun Qi
Runpei Dong
Xiuxiu Bai
Xing Wei
107
6
0
21 Aug 2024
OpenScan: A Benchmark for Generalized Open-Vocabulary 3D Scene Understanding
Youjun Zhao
Jiaying Lin
Shuquan Ye
Qianshi Pang
Rynson W. H. Lau
178
2
0
20 Aug 2024
Vocabulary-Free 3D Instance Segmentation with Vision and Language Assistant
Guofeng Mei
Luigi Riz
Yiming Wang
Fabio Poiesi
ISeg, VLM
131
4
0
20 Aug 2024
Zero-Shot Dual-Path Integration Framework for Open-Vocabulary 3D Instance Segmentation
Tri Ton
Ji Woo Hong
Soohwan Eom
Jun Yeop Shim
Junyeong Kim
Chang D. Yoo
3DPC, ISeg
69
2
0
16 Aug 2024
Vision-Language Guidance for LiDAR-based Unsupervised 3D Object Detection
Christian Fruhwirth-Reisinger
Wei Lin
Dušan Malić
Horst Bischof
Horst Possegger
3DPC
70
1
0
07 Aug 2024
MILAN: Milli-Annotations for Lidar Semantic Segmentation
Nermin Samet
Gilles Puy
Oriane Siméoni
Renaud Marlet
3DPC
84
0
0
22 Jul 2024
OpenSU3D: Open World 3D Scene Understanding using Foundation Models
Rafay Mohiuddin
Sai Manoj Prakhya
Fiona Collins
Ziyuan Liu
André Borrmann
56
2
0
19 Jul 2024
SegPoint: Segment Any Point Cloud via Large Language Model
Shuting He
Henghui Ding
Xudong Jiang
Bihan Wen
3DV, MLLM, 3DPC
92
19
0
18 Jul 2024