1912.08830
ScanRefer: 3D Object Localization in RGB-D Scans using Natural Language
18 December 2019
Dave Zhenyu Chen
Angel X. Chang
Matthias Nießner
3DPC
Papers citing
"ScanRefer: 3D Object Localization in RGB-D Scans using Natural Language"
50 / 238 papers shown
Title
Instance-free Text to Point Cloud Localization with Relative Position Awareness
Lichao Wang
Zhihao Yuan
Jinke Ren
Shuguang Cui
Zhen Li
44
0
0
27 Apr 2024
Think-Program-reCtify: 3D Situated Reasoning with Large Language Models
Qingrong He
Kejun Lin
Shizhe Chen
Anwen Hu
Qin Jin
LRM
45
1
0
23 Apr 2024
"Where am I?" Scene Retrieval with Language
Jiaqi Chen
Dániel Baráth
Iro Armeni
Marc Pollefeys
Hermann Blum
LM&Ro
58
5
0
22 Apr 2024
Unified Scene Representation and Reconstruction for 3D Large Language Models
Tao Chu
Pan Zhang
Xiao-wen Dong
Yuhang Zang
Qiong Liu
Jiaqi Wang
37
1
0
19 Apr 2024
Contrastive Gaussian Clustering: Weakly Supervised 3D Scene Segmentation
Myrna C. Silva
Mahtab Dahaghin
M. Toso
Alessio Del Bue
3DGS
37
11
0
19 Apr 2024
Rethinking 3D Dense Caption and Visual Grounding in A Unified Framework through Prompt-based Localization
Yongdong Luo
Haojia Lin
Xiawu Zheng
Yigeng Jiang
Rongrong Ji
Jie Hu
Guannan Jiang
Songan Zhang
Rongrong Ji
26
0
0
17 Apr 2024
Weakly-Supervised 3D Scene Graph Generation via Visual-Linguistic Assisted Pseudo-labeling
Xu Wang
Yifan Li
Qiudan Zhang
Wen-Bin Wu
Mark Junjie Li
Jianmin Jiang
51
1
0
03 Apr 2024
Segment Any 3D Object with Language
Seungjun Lee
Yuyang Zhao
Gim Hee Lee
44
1
0
02 Apr 2024
TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes
Bu Jin
Yupeng Zheng
Pengfei Li
Weize Li
Yuhang Zheng
...
Kun Zhan
Peng Jia
Xiaoxiao Long
Yilun Chen
Hao Zhao
3DV
79
15
0
28 Mar 2024
PointCloud-Text Matching: Benchmark Datasets and a Baseline
Yanglin Feng
Yang Qin
Dezhong Peng
Erik Cambria
Xi Peng
Peng Hu
50
1
0
28 Mar 2024
Data-Efficient 3D Visual Grounding via Order-Aware Referring
Tung-Yu Wu
Sheng-Yu Huang
Yu-Chiang Frank Wang
34
0
0
25 Mar 2024
Can 3D Vision-Language Models Truly Understand Natural Language?
Weipeng Deng
Jihan Yang
Runyu Ding
Jiahui Liu
Yijiang Li
Xiaojuan Qi
Edith C.H. Ngai
39
4
0
21 Mar 2024
Agent3D-Zero: An Agent for Zero-shot 3D Understanding
Sha Zhang
Di Huang
Jiajun Deng
Shixiang Tang
Wanli Ouyang
Tong He
Yanyong Zhang
VGen
46
14
0
18 Mar 2024
Scene-LLM: Extending Language Model for 3D Visual Understanding and Reasoning
Rao Fu
Jingyu Liu
Xilun Chen
Yixin Nie
Wenhan Xiong
LM&Ro
LRM
49
51
0
18 Mar 2024
GaussianGrasper: 3D Language Gaussian Splatting for Open-vocabulary Robotic Grasping
Yuhang Zheng
Xiangyu Chen
Yupeng Zheng
Songen Gu
Runyi Yang
...
Chao Yang
Dawei Wang
Zhen Chen
Xiaoxiao Long
Meiqing Wang
58
43
0
14 Mar 2024
SeCG: Semantic-Enhanced 3D Visual Grounding via Cross-modal Graph Attention
Feng Xiao
Hongbin Xu
Qiuxia Wu
Wenxiong Kang
34
2
0
13 Mar 2024
A Comprehensive Survey of 3D Dense Captioning: Localizing and Describing Objects in 3D Scenes
Ting Yu
Xiaojun Lin
Shuhui Wang
Weiguo Sheng
Qingming Huang
Jun-chen Yu
3DV
54
10
0
12 Mar 2024
ShapeLLM: Universal 3D Object Understanding for Embodied Interaction
Zekun Qi
Runpei Dong
Shaochen Zhang
Haoran Geng
Chunrui Han
Zheng Ge
Li Yi
Kaisheng Ma
41
52
0
27 Feb 2024
OpenSUN3D: 1st Workshop Challenge on Open-Vocabulary 3D Scene Understanding
Francis Engelmann
Ayca Takmaz
Jonas Schult
Elisabetta Fedele
Johanna Wald
...
Xiaoyang Wu
Xi Chen
Hengshuang Zhao
Lei Zhu
Joan Lasenby
44
3
0
23 Feb 2024
HaLo-NeRF: Learning Geometry-Guided Semantics for Exploring Unconstrained Photo Collections
Chen Dudai
Morris Alper
Hana Bezalel
Rana Hanocka
Itai Lang
Hadar Averbuch-Elor
23
2
0
14 Feb 2024
3DMIT: 3D Multi-modal Instruction Tuning for Scene Understanding
Zeju Li
Chao Zhang
Xiaoyan Wang
Ruilong Ren
Yifan Xu
Ruifei Ma
Xiangde Liu
MLLM
21
20
0
06 Jan 2024
EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI
Tai Wang
Xiaohan Mao
Chenming Zhu
Runsen Xu
Ruiyuan Lyu
...
Tianfan Xue
Xihui Liu
Cewu Lu
Dahua Lin
Jiangmiao Pang
LM&Ro
37
60
0
26 Dec 2023
LiDAR-LLM: Exploring the Potential of Large Language Models for 3D LiDAR Understanding
Senqiao Yang
Jiaming Liu
Ray Zhang
Mingjie Pan
Zoey Guo
Xiaoqi Li
Zehui Chen
Peng Gao
Yandong Guo
Shanghang Zhang
3DV
26
58
0
21 Dec 2023
M3DBench: Let's Instruct Large Models with Multi-modal 3D Prompts
Mingsheng Li
Xin Chen
C. Zhang
Sijin Chen
Erik Cambria
Fukun Yin
Gang Yu
Tao Chen
31
24
0
17 Dec 2023
Weakly-Supervised 3D Visual Grounding based on Visual Linguistic Alignment
Xiaoxu Xu
Yitian Yuan
Qiudan Zhang
Wen-Bin Wu
Zequn Jie
Lin Ma
Xu Wang
61
4
0
15 Dec 2023
Chat-3D v2: Bridging 3D Scene and Large Language Models with Object Identifiers
Haifeng Huang
Zehan Wang
Rongjie Huang
Luping Liu
Xize Cheng
Yang Zhao
Tao Jin
Zhou Zhao
61
46
0
13 Dec 2023
Mono3DVG: 3D Visual Grounding in Monocular Images
Yangfan Zhan
Yuan Yuan
Zhitong Xiong
MDE
36
9
0
13 Dec 2023
Uni3DL: Unified Model for 3D and Language Understanding
Xiang Li
Jian Ding
Zhaoyang Chen
Mohamed Elhoseiny
38
3
0
05 Dec 2023
LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning
Sijin Chen
Xin Chen
C. Zhang
Mingsheng Li
Gang Yu
Hao Fei
Erik Cambria
Jiayuan Fan
Tao Chen
MLLM
29
79
0
30 Nov 2023
SceneTex: High-Quality Texture Synthesis for Indoor Scenes via Diffusion Priors
Dave Zhenyu Chen
Haoxuan Li
Hsin-Ying Lee
Sergey Tulyakov
Matthias Nießner
DiffM
27
28
0
28 Nov 2023
Text2Loc: 3D Point Cloud Localization from Natural Language
Yan Xia
Letian Shi
Zifeng Ding
João F. Henriques
Daniel Cremers
32
25
0
27 Nov 2023
Visual Programming for Zero-shot Open-Vocabulary 3D Visual Grounding
Zhihao Yuan
Jinke Ren
Chun-Mei Feng
Hengshuang Zhao
Shuguang Cui
Zhen Li
39
26
0
26 Nov 2023
An Embodied Generalist Agent in 3D World
Jiangyong Huang
Silong Yong
Xiaojian Ma
Xiongkun Linghu
Puhao Li
Yan Wang
Qing Li
Song-Chun Zhu
Baoxiong Jia
Siyuan Huang
LM&Ro
31
139
0
18 Nov 2023
Language-guided Robot Grasping: CLIP-based Referring Grasp Synthesis in Clutter
Georgios Tziafas
Yucheng Xu
Arushi Goel
M. Kasaei
Zhibin Li
H. Kasaei
32
23
0
09 Nov 2023
Single-view 3D Scene Reconstruction with High-fidelity Shape and Texture
Yixin Chen
Junfeng Ni
Nan Jiang
Yaowei Zhang
Yixin Zhu
Siyuan Huang
3DV
30
21
0
01 Nov 2023
Generating Context-Aware Natural Answers for Questions in 3D Scenes
Mohammed Munzer Dwedari
Matthias Nießner
Dave Zhenyu Chen
27
1
0
30 Oct 2023
CityRefer: Geography-aware 3D Visual Grounding Dataset on City-scale Point Cloud Data
Taiki Miyanishi
Fumiya Kitamori
Shuhei Kurita
Jungdae Lee
M. Kawanabe
Nakamasa Inoue
AI4TS
3DPC
17
4
0
28 Oct 2023
Recent Advances in Multi-modal 3D Scene Understanding: A Comprehensive Survey and Evaluation
Yinjie Lei
Zixuan Wang
Feng Chen
Guoqing Wang
Peng Wang
Yang Yang
34
10
0
24 Oct 2023
Extending Multi-modal Contrastive Representations
Zehan Wang
Ziang Zhang
Luping Liu
Yang Zhao
Haifeng Huang
Tao Jin
Zhou Zhao
26
5
0
13 Oct 2023
CoT3DRef: Chain-of-Thoughts Data-Efficient 3D Visual Grounding
Eslam Mohamed Bakr
Mohamed Ayman
Mahmoud Ahmed
Habib Slim
Mohamed Elhoseiny
LRM
28
12
0
10 Oct 2023
Talk2BEV: Language-enhanced Bird's-eye View Maps for Autonomous Driving
Tushar Choudhary
Vikrant Dewangan
Shivam Chandhok
Shubham Priyadarshan
Anushka Jain
A. K. Singh
Siddharth Srivastava
Krishna Murthy Jatavallabhula
K. M. Krishna
50
58
0
03 Oct 2023
PolarNet: 3D Point Clouds for Language-Guided Robotic Manipulation
Shizhe Chen
Ricardo Garcia Pinel
Cordelia Schmid
Ivan Laptev
LM&Ro
3DPC
30
34
0
27 Sep 2023
LLM-Grounder: Open-Vocabulary 3D Visual Grounding with Large Language Model as an Agent
Jianing Yang
Xuweiyi Chen
Shengyi Qian
Nikhil Madaan
Madhavan Iyengar
David Fouhey
Joyce Chai
LM&Ro
LLMAG
43
84
0
21 Sep 2023
Object2Scene: Putting Objects in Context for Open-Vocabulary 3D Detection
Chenming Zhu
Wenwei Zhang
Tai Wang
Xihui Liu
Kai-xiang Chen
3DPC
41
18
0
18 Sep 2023
Rank2Tell: A Multimodal Driving Dataset for Joint Importance Ranking and Reasoning
Enna Sachdeva
Nakul Agarwal
Suhas Chundi
Sean Roelofs
Jiachen Li
Mykel Kochenderfer
Chiho Choi
Behzad Dariush
33
47
0
12 Sep 2023
Multi3DRefer: Grounding Text Description to Multiple 3D Objects
Yiming Zhang
ZeMing Gong
Angel X. Chang
47
63
0
11 Sep 2023
Four Ways to Improve Verbo-visual Fusion for Dense 3D Visual Grounding
Ozan Unal
Daniel Gehrig
Suman Saha
Luc Van Gool
36
12
0
08 Sep 2023
FArMARe: a Furniture-Aware Multi-task methodology for Recommending Apartments based on the user interests
Ali Abdari
Alex Falcon
Giuseppe Serra
34
2
0
06 Sep 2023
Vote2Cap-DETR++: Decoupling Localization and Describing for End-to-End 3D Dense Captioning
Sijin Chen
Erik Cambria
Mingsheng Li
Xin Chen
Peng Guo
Yinjie Lei
Gang Yu
Taihao Li
Tao Chen
19
18
0
06 Sep 2023
Dense Object Grounding in 3D Scenes
Wencan Huang
Daizong Liu
Wei Hu
13
17
0
05 Sep 2023