1912.08830
ScanRefer: 3D Object Localization in RGB-D Scans using Natural Language
18 December 2019
Dave Zhenyu Chen
Angel X. Chang
Matthias Nießner
3DPC
Papers citing
"ScanRefer: 3D Object Localization in RGB-D Scans using Natural Language"
50 / 238 papers shown
Title
OrionNav: Online Planning for Robot Autonomy with Context-Aware LLM and Open-Vocabulary Semantic Scene Graphs
Venkata Naren Devarakonda
Raktim Gautam Goswami
Ali Umut Kaypak
Naman Patel
Rooholla Khorrambakht
Prashanth Krishnamurthy
Farshad Khorrami
LM&Ro
39
3
0
08 Oct 2024
In-Place Panoptic Radiance Field Segmentation with Perceptual Prior for 3D Scene Understanding
Shenghao Li
40
1
0
06 Oct 2024
The Wallpaper is Ugly: Indoor Localization using Vision and Language
Seth Pate
Lawson L. S. Wong
33
0
0
04 Oct 2024
SPARTUN3D: Situated Spatial Understanding of 3D World in Large Language Models
Yue Zhang
Zhiyang Xu
Ying Shen
Parisa Kordjamshidi
Lifu Huang
34
6
0
04 Oct 2024
LLaVA-3D: A Simple yet Effective Pathway to Empowering LMMs with 3D-awareness
Chenming Zhu
Tai Wang
Wenwei Zhang
Jiangmiao Pang
Xihui Liu
134
32
0
26 Sep 2024
ChatCam: Empowering Camera Control through Conversational AI
Xinhang Liu
Yu-Wing Tai
Chi-Keung Tang
VGen
33
2
0
25 Sep 2024
SYNERGAI: Perception Alignment for Human-Robot Collaboration
Yixin Chen
Guoxi Zhang
Yaowei Zhang
Hongming Xu
Peiyuan Zhi
Qing Li
Siyuan Huang
37
0
0
24 Sep 2024
Oryx MLLM: On-Demand Spatial-Temporal Understanding at Arbitrary Resolution
Zuyan Liu
Yuhao Dong
Ziwei Liu
Winston Hu
Jiwen Lu
Yongming Rao
ObjD
86
55
0
19 Sep 2024
HiFi-CS: Towards Open Vocabulary Visual Grounding For Robotic Grasping Using Vision-Language Models
V. Bhat
Prashanth Krishnamurthy
Ramesh Karri
Farshad Khorrami
46
4
0
16 Sep 2024
QueryCAD: Grounded Question Answering for CAD Models
Claudius Kienle
Benjamin Alt
Darko Katic
Rainer Jäkel
Jan Peters
31
2
0
13 Sep 2024
Bayesian Self-Training for Semi-Supervised 3D Segmentation
Ozan Unal
Daniel Gehrig
Luc Van Gool
3DPC
3DV
37
0
0
12 Sep 2024
Towards Energy-Efficiency by Navigating the Trilemma of Energy, Latency, and Accuracy
Boyuan Tian
Yihan Pang
Muhammad Huzaifa
Shenlong Wang
Sarita Adve
34
1
0
06 Sep 2024
Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding
Yunze Man
Shuhong Zheng
Zhipeng Bao
M. Hebert
Liang-Yan Gui
Yu-Xiong Wang
78
15
0
05 Sep 2024
Multi-modal Situated Reasoning in 3D Scenes
Xiongkun Linghu
Jiangyong Huang
Xuesong Niu
Xiaojian Ma
Baoxiong Jia
Siyuan Huang
39
12
0
04 Sep 2024
Space3D-Bench: Spatial 3D Question Answering Benchmark
E. Szymańska
Mihai Dusmanu
J. Buurlage
Mahdi Rad
Marc Pollefeys
59
4
0
29 Aug 2024
AeroVerse: UAV-Agent Benchmark Suite for Simulating, Pre-training, Finetuning, and Evaluating Aerospace Embodied World Models
Fanglong Yao
Yuanchang Yue
Youzhi Liu
Xian Sun
Kun Fu
VGen
EgoV
29
6
0
28 Aug 2024
R2G: Reasoning to Ground in 3D Scenes
Yixuan Li
Zan Wang
Wei Liang
46
2
0
24 Aug 2024
Open-Ended 3D Point Cloud Instance Segmentation
Phuc D. A. Nguyen
Minh Luu
Anh Tran
Cuong Pham
Khoi Nguyen
3DPC
56
1
0
21 Aug 2024
See It All: Contextualized Late Aggregation for 3D Dense Captioning
Minjung Kim
Hyung Suk Lim
Seung Hwan Kim
Soonyoung Lee
Bumsoo Kim
Gunhee Kim
55
4
0
14 Aug 2024
Bi-directional Contextual Attention for 3D Dense Captioning
Minjung Kim
Hyung Suk Lim
Soonyoung Lee
Bumsoo Kim
Gunhee Kim
43
3
0
13 Aug 2024
3D-GRES: Generalized 3D Referring Expression Segmentation
Changli Wu
Yihang Liu
Jiayi Ji
Yiwei Ma
Haowei Wang
Gen Luo
Henghui Ding
Xiaoshuai Sun
Rongrong Ji
42
7
0
30 Jul 2024
Answerability Fields: Answerable Location Estimation via Diffusion Models
Daich Azuma
Taiki Miyanishi
Shuhei Kurita
Koya Sakamoto
M. Kawanabe
DiffM
48
0
0
26 Jul 2024
RefMask3D: Language-Guided Transformer for 3D Referring Segmentation
Shuting He
Henghui Ding
61
10
0
25 Jul 2024
SegPoint: Segment Any Point Cloud via Large Language Model
Shuting He
Henghui Ding
Xudong Jiang
Bihan Wen
3DV
MLLM
3DPC
48
19
0
18 Jul 2024
GRUtopia: Dream General Robots in a City at Scale
Hanqing Wang
Jiahe Chen
Wensi Huang
Qingwei Ben
Tai Wang
...
Ying Zhao
Zhongying Tu
Yu Qiao
Dahua Lin
Jiangmiao Pang
LM&Ro
VGen
57
16
0
15 Jul 2024
Dense Multimodal Alignment for Open-Vocabulary 3D Scene Understanding
Ruihuang Li
Zhengqiang Zhang
Chenhang He
Zhiyuan Ma
Vishal M. Patel
Lei Zhang
3DV
VLM
42
5
0
13 Jul 2024
Aligning Cyber Space with Physical World: A Comprehensive Survey on Embodied AI
Yang Liu
Weixing Chen
Yongjie Bai
Xiaodan Liang
Guanbin Li
Wen Gao
Liang Lin
LM&Ro
SyDa
AI4CE
51
50
0
09 Jul 2024
3D Vision and Language Pretraining with Large-Scale Synthetic Data
Dejie Yang
Zhu Xu
Wentao Mo
Qingchao Chen
Siyuan Huang
Yang Liu
24
5
0
08 Jul 2024
Multi-branch Collaborative Learning Network for 3D Visual Grounding
Zhipeng Qian
Yiwei Ma
Zhekai Lin
Jiayi Ji
Xiawu Zheng
Xiaoshuai Sun
Rongrong Ji
3DV
46
4
0
07 Jul 2024
A Unified Framework for 3D Scene Understanding
Wei Xu
Chunsheng Shi
Sifan Tu
Xin Zhou
Dingkang Liang
Xiang Bai
VOS
34
5
0
03 Jul 2024
Multi-Task Domain Adaptation for Language Grounding with 3D Objects
Penglei Sun
Yaoxian Song
Xinglin Pan
Peijie Dong
Xiaofei Yang
Qiang-qiang Wang
Zhixu Li
Tiefeng Li
Xiaowen Chu
70
1
0
03 Jul 2024
ScanReason: Empowering 3D Visual Grounding with Reasoning Capabilities
Chenming Zhu
Tai Wang
Wenwei Zhang
Kai Chen
Xihui Liu
ReLM
LRM
45
16
0
01 Jul 2024
3D Feature Distillation with Object-Centric Priors
Georgios Tziafas
Yucheng Xu
Zhibin Li
H. Kasaei
36
1
0
26 Jun 2024
MMScan: A Multi-Modal 3D Scene Dataset with Hierarchical Grounded Language Annotations
Ruiyuan Lyu
Tai Wang
Jingli Lin
Shuai Yang
Xiaohan Mao
...
Runsen Xu
Haifeng Huang
Chenming Zhu
Dahua Lin
Jiangmiao Pang
3DV
49
11
0
13 Jun 2024
Dual Attribute-Spatial Relation Alignment for 3D Visual Grounding
Yue Xu
Kaizhi Yang
Jiebo Luo
Xuejin Chen
3DPC
45
1
0
13 Jun 2024
Situational Awareness Matters in 3D Vision Language Reasoning
Yunze Man
Liang-Yan Gui
Yu-Xiong Wang
43
12
0
11 Jun 2024
Beyond Bare Queries: Open-Vocabulary Object Grounding with 3D Scene Graph
S. Linok
T. Zemskova
Svetlana Ladanova
Roman Titkov
Dmitry A. Yudin
Maxim Monastyrny
Aleksei Valenkov
LM&Ro
57
3
0
11 Jun 2024
A Survey on Text-guided 3D Visual Grounding: Elements, Recent Advances, and Future Directions
Daizong Liu
Yang Liu
Wencan Huang
Wei Hu
LM&Ro
40
9
0
09 Jun 2024
3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination
Jianing Yang
Xuweiyi Chen
Nikhil Madaan
Madhavan Iyengar
Shengyi Qian
David Fouhey
Joyce Chai
3DV
78
11
0
07 Jun 2024
Kestrel: Point Grounding Multimodal LLM for Part-Aware 3D Vision-Language Understanding
Junjie Fei
Mahmoud Ahmed
Jian Ding
Eslam Mohamed Bakr
Mohamed Elhoseiny
36
3
0
29 May 2024
Intent3D: 3D Object Detection in RGB-D Scans Based on Human Intention
Weitai Kang
Mengxue Qu
Jyoti Kini
Yunchao Wei
Mubarak Shah
Yan Yan
LM&Ro
3DPC
53
10
0
28 May 2024
Reason3D: Searching and Reasoning 3D Segmentation via Large Language Model
Kuan-Chih Huang
Xiangtai Li
Lu Qi
Shuicheng Yan
Ming-Hsuan Yang
LRM
76
10
0
27 May 2024
Talk to Parallel LiDARs: A Human-LiDAR Interaction Method Based on 3D Visual Grounding
Yuhang Liu
Boyi Sun
Guixu Zheng
Yishuo Wang
Jing Wang
Fei-Yue Wang
42
2
0
24 May 2024
Unifying 3D Vision-Language Understanding via Promptable Queries
Ziyu Zhu
Zhuofan Zhang
Xiaojian Ma
Xuesong Niu
Yixin Chen
Baoxiong Jia
Zhidong Deng
Siyuan Huang
Qing Li
48
21
0
19 May 2024
Grounded 3D-LLM with Referent Tokens
Yilun Chen
Shuai Yang
Haifeng Huang
Tai Wang
Ruiyuan Lyu
Runsen Xu
Dahua Lin
Jiangmiao Pang
53
23
0
16 May 2024
When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models
Xianzheng Ma
Yash Bhalgat
Brandon Smart
Shuai Chen
Xinghui Li
...
Matthias Nießner
Ian D Reid
Angel X. Chang
Iro Laina
V. Prisacariu
LRM
33
13
0
16 May 2024
Generating Human Motion in 3D Scenes from Text Descriptions
Zhi Cen
Huaijin Pi
Sida Peng
Zehong Shen
Minghui Yang
Shuai Zhu
Hujun Bao
Xiaowei Zhou
50
19
0
13 May 2024
Tactile-Augmented Radiance Fields
Yiming Dou
Fengyu Yang
Yi Liu
Antonio Loquercio
Andrew Owens
36
18
0
07 May 2024
Naturally Supervised 3D Visual Grounding with Language-Regularized Concept Learners
Chun Feng
Joy Hsu
Weiyu Liu
Jiajun Wu
PINN
LRM
46
6
0
30 Apr 2024
Transcrib3D: 3D Referring Expression Resolution through Large Language Models
Jiading Fang
Xiangshan Tan
Shengjie Lin
Igor Vasiljevic
Vitor Campagnolo Guizilini
Hongyuan Mei
Rares Andrei Ambrus
Gregory Shakhnarovich
Matthew R. Walter
LM&Ro
41
4
0
30 Apr 2024