2204.02174
Multi-View Transformer for 3D Visual Grounding
5 April 2022
Shijia Huang
Yilun Chen
Jiaya Jia
Liwei Wang
Papers citing "Multi-View Transformer for 3D Visual Grounding"
50 / 83 papers shown
SpatialPrompting: Keyframe-driven Zero-Shot Spatial Reasoning with Off-the-Shelf Multimodal Large Language Models
Shun Taguchi
Hideki Deguchi
Takumi Hamazaki
Hiroyuki Sakai
ReLM
LRM
49
0
0
08 May 2025
DenseGrounding: Improving Dense Language-Vision Semantics for Ego-Centric 3D Visual Grounding
Henry Zheng
Hao Shi
Qihang Peng
Yong Xien Chng
Rui Huang
Yepeng Weng
Zhongchao Shi
Gao Huang
74
1
0
08 May 2025
AS3D: 2D-Assisted Cross-Modal Understanding with Semantic-Spatial Scene Graphs for 3D Visual Grounding
Feng Xiao
Hongbin Xu
Guocan Zhao
Wenxiong Kang
50
0
0
07 May 2025
3DWG: 3D Weakly Supervised Visual Grounding via Category and Instance-Level Alignment
Xianrui Li
Jing Liu
Nuowei Han
Liang Heng
Y. Guo
Hao Dong
Yang Liu
71
0
0
03 May 2025
SORT3D: Spatial Object-centric Reasoning Toolbox for Zero-Shot 3D Grounding Using Large Language Models
Nader Zantout
Haochen Zhang
Pujith Kachana
J. Qiu
Ji Zhang
Wenshan Wang
LM&Ro
LRM
147
0
0
25 Apr 2025
Multi-Object Grounding via Hierarchical Contrastive Siamese Transformers
Chengyi Du
Keyan Jin
29
0
0
14 Apr 2025
Ross3D: Reconstructive Visual Instruction Tuning with 3D-Awareness
Haochen Wang
Yucheng Zhao
Tiancai Wang
Haoqiang Fan
Xinming Zhang
Zhaoxiang Zhang
66
0
0
02 Apr 2025
ReasonGrounder: LVLM-Guided Hierarchical Feature Splatting for Open-Vocabulary 3D Visual Grounding and Reasoning
Zhenyang Liu
Yikai Wang
Sixiao Zheng
Tongying Pan
Longfei Liang
Yanwei Fu
Xiangyang Xue
LRM
54
0
0
30 Mar 2025
From Flatland to Space: Teaching Vision-Language Models to Perceive and Reason in 3D
Jiahui Zhang
Yurui Chen
Yanpeng Zhou
Yueming Xu
Ze Huang
...
Xinyue Cai
G. Huang
Xingyue Quan
Hang Xu
Li Zhang
LRM
94
0
0
29 Mar 2025
Empowering Large Language Models with 3D Situation Awareness
Zhihao Yuan
Yibo Peng
Jinke Ren
Yinghong Liao
Yatong Han
Chun-Mei Feng
Hengshuang Zhao
G. Li
Shuguang Cui
Zhen Li
51
0
0
29 Mar 2025
Unveiling the Mist over 3D Vision-Language Understanding: Object-centric Evaluation with Chain-of-Analysis
J. Huang
Baoxiong Jia
Yixuan Wang
Ziyu Zhu
Xiongkun Linghu
Qing Li
Song-Chun Zhu
Siyuan Huang
84
3
0
28 Mar 2025
Open-Vocabulary Functional 3D Scene Graphs for Real-World Indoor Spaces
Chenyangguang Zhang
Alexandros Delitzas
Fangjinhua Wang
Ruida Zhang
Xiangyang Ji
Marc Pollefeys
Francis Engelmann
3DV
3DPC
49
4
0
24 Mar 2025
IRef-VLA: A Benchmark for Interactive Referential Grounding with Imperfect Language in 3D Scenes
Haochen Zhang
Nader Zantout
Pujith Kachana
Ji Zhang
Wenshan Wang
VGen
51
0
0
20 Mar 2025
Exploring 3D Activity Reasoning and Planning: From Implicit Human Intentions to Route-Aware Planning
Xueying Jiang
Wenhao Li
Xiaoqin Zhang
Ling Shao
Shijian Lu
LRM
47
0
0
17 Mar 2025
ProxyTransformation: Preshaping Point Cloud Manifold With Proxy Attention For 3D Visual Grounding
Qihang Peng
Henry Zheng
Gao Huang
3DPC
84
0
0
26 Feb 2025
Evolving Symbolic 3D Visual Grounder with Weakly Supervised Reflection
Boyu Mi
Hanqing Wang
Tai Wang
Yilun Chen
Jiangmiao Pang
74
0
0
21 Feb 2025
AugRefer: Advancing 3D Visual Grounding via Cross-Modal Augmentation and Spatial Relation-based Referring
Xinyi Wang
Na Zhao
Zhiyuan Han
Dan Guo
Xun Yang
48
1
0
17 Jan 2025
GPT4Scene: Understand 3D Scenes from Videos with Vision-Language Models
Zhangyang Qi
Zhixiong Zhang
Ye Fang
Jiaqi Wang
Hengshuang Zhao
83
6
0
02 Jan 2025
LidaRefer: Outdoor 3D Visual Grounding for Autonomous Driving with Transformers
Yeong-Seung Baek
Heung-Seon Oh
31
0
0
07 Nov 2024
VLA-3D: A Dataset for 3D Semantic Scene Understanding and Navigation
Haochen Zhang
Nader Zantout
Pujith Kachana
Zongyuan Wu
Ji Zhang
Wenshan Wang
3DV
LM&Ro
41
5
0
05 Nov 2024
Multi-Object 3D Grounding with Dynamic Modules and Language-Informed Spatial Attention
Haomeng Zhang
Chiao-An Yang
Raymond A. Yeh
39
1
0
29 Oct 2024
VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding
Runsen Xu
Zhiwei Huang
Tai Wang
Y. Chen
Jiangmiao Pang
Dahua Lin
VGen
41
11
0
17 Oct 2024
SeedLM: Compressing LLM Weights into Seeds of Pseudo-Random Generators
Rasoul Shafipour
David Harrison
Maxwell Horton
Jeffrey Marker
Houman Bedayat
Sachin Mehta
Mohammad Rastegari
Mahyar Najibi
Saman Naderiparizi
MQ
57
0
0
14 Oct 2024
LLaVA-3D: A Simple yet Effective Pathway to Empowering LMMs with 3D-awareness
Chenming Zhu
Tai Wang
Wenwei Zhang
Jiangmiao Pang
Xihui Liu
134
30
0
26 Sep 2024
QueryCAD: Grounded Question Answering for CAD Models
Claudius Kienle
Benjamin Alt
Darko Katic
Rainer Jäkel
Jan Peters
23
2
0
13 Sep 2024
Bayesian Self-Training for Semi-Supervised 3D Segmentation
Ozan Unal
Christos Sakaridis
Luc Van Gool
3DPC
3DV
34
0
0
12 Sep 2024
R2G: Reasoning to Ground in 3D Scenes
Yixuan Li
Zan Wang
Wei Liang
41
2
0
24 Aug 2024
Open-Vocabulary 3D Semantic Segmentation with Text-to-Image Diffusion Models
Xiaoyu Zhu
Hao Zhou
Pengfei Xing
Long Zhao
Hao Xu
Junwei Liang
Alex Hauptmann
Ting Liu
Andrew C. Gallagher
DiffM
59
4
0
18 Jul 2024
Aligning Cyber Space with Physical World: A Comprehensive Survey on Embodied AI
Yang Liu
Weixing Chen
Yongjie Bai
Xiaodan Liang
Guanbin Li
Wen Gao
Liang Lin
LM&Ro
SyDa
AI4CE
51
50
0
09 Jul 2024
3D Vision and Language Pretraining with Large-Scale Synthetic Data
Dejie Yang
Zhu Xu
Wentao Mo
Qingchao Chen
Siyuan Huang
Yang Liu
24
5
0
08 Jul 2024
Multi-Task Domain Adaptation for Language Grounding with 3D Objects
Penglei Sun
Yaoxian Song
Xinglin Pan
Peijie Dong
Xiaofei Yang
Qiang-qiang Wang
Zhixu Li
Tiefeng Li
Xiaowen Chu
64
1
0
03 Jul 2024
ScanReason: Empowering 3D Visual Grounding with Reasoning Capabilities
Chenming Zhu
Tai Wang
Wenwei Zhang
Kai Chen
Xihui Liu
ReLM
LRM
45
16
0
01 Jul 2024
3D Feature Distillation with Object-Centric Priors
Georgios Tziafas
Yucheng Xu
Zhibin Li
H. Kasaei
34
1
0
26 Jun 2024
Dual Attribute-Spatial Relation Alignment for 3D Visual Grounding
Yue Xu
Kaizhi Yang
Jiebo Luo
Xuejin Chen
3DPC
45
1
0
13 Jun 2024
RVT-2: Learning Precise Manipulation from Few Demonstrations
Ankit Goyal
Valts Blukis
Jie Xu
Yijie Guo
Yu-Wei Chao
Dieter Fox
35
38
0
12 Jun 2024
A Survey on Text-guided 3D Visual Grounding: Elements, Recent Advances, and Future Directions
Daizong Liu
Yang Liu
Wencan Huang
Wei Hu
LM&Ro
35
9
0
09 Jun 2024
Talk2Radar: Bridging Natural Language with 4D mmWave Radar for 3D Referring Expression Comprehension
Runwei Guan
Ruixiao Zhang
Ningwei Ouyang
Jianan Liu
Ka Lok Man
...
Ming Xu
Jeremy S. Smith
Eng Gee Lim
Yutao Yue
Hui Xiong
51
9
0
21 May 2024
Unifying 3D Vision-Language Understanding via Promptable Queries
Ziyu Zhu
Zhuofan Zhang
Xiaojian Ma
Xuesong Niu
Yixin Chen
Baoxiong Jia
Zhidong Deng
Siyuan Huang
Qing Li
48
21
0
19 May 2024
Grounded 3D-LLM with Referent Tokens
Yilun Chen
Shuai Yang
Haifeng Huang
Tai Wang
Ruiyuan Lyu
Runsen Xu
Dahua Lin
Jiangmiao Pang
53
22
0
16 May 2024
Generating Human Motion in 3D Scenes from Text Descriptions
Zhi Cen
Huaijin Pi
Sida Peng
Zehong Shen
Minghui Yang
Shuai Zhu
Hujun Bao
Xiaowei Zhou
50
19
0
13 May 2024
Naturally Supervised 3D Visual Grounding with Language-Regularized Concept Learners
Chun Feng
Joy Hsu
Weiyu Liu
Jiajun Wu
PINN
LRM
43
6
0
30 Apr 2024
Transcrib3D: 3D Referring Expression Resolution through Large Language Models
Jiading Fang
Xiangshan Tan
Shengjie Lin
Igor Vasiljevic
Vitor Campagnolo Guizilini
Hongyuan Mei
Rares Ambrus
Gregory Shakhnarovich
Matthew R. Walter
LM&Ro
41
4
0
30 Apr 2024
"Where am I?" Scene Retrieval with Language
Jiaqi Chen
Dániel Baráth
Iro Armeni
Marc Pollefeys
Hermann Blum
LM&Ro
58
5
0
22 Apr 2024
Multilateral Temporal-view Pyramid Transformer for Video Inpainting Detection
Ying Zhang
Yuezun Li
Bo Peng
Jiaran Zhou
Huiyu Zhou
Junyu Dong
45
0
0
17 Apr 2024
Data-Efficient 3D Visual Grounding via Order-Aware Referring
Tung-Yu Wu
Sheng-Yu Huang
Yu-Chiang Frank Wang
34
0
0
25 Mar 2024
Can 3D Vision-Language Models Truly Understand Natural Language?
Weipeng Deng
Jihan Yang
Runyu Ding
Jiahui Liu
Yijiang Li
Xiaojuan Qi
Edith C.H. Ngai
37
4
0
21 Mar 2024
SeCG: Semantic-Enhanced 3D Visual Grounding via Cross-modal Graph Attention
Feng Xiao
Hongbin Xu
Qiuxia Wu
Wenxiong Kang
34
2
0
13 Mar 2024
MiKASA: Multi-Key-Anchor & Scene-Aware Transformer for 3D Visual Grounding
Chun-Peng Chang
Shaoxiang Wang
A. Pagani
Didier Stricker
40
7
0
05 Mar 2024
HaLo-NeRF: Learning Geometry-Guided Semantics for Exploring Unconstrained Photo Collections
Chen Dudai
Morris Alper
Hana Bezalel
Rana Hanocka
Itai Lang
Hadar Averbuch-Elor
23
2
0
14 Feb 2024
P2M2-Net: Part-Aware Prompt-Guided Multimodal Point Cloud Completion
Linlian Jiang
Pan Chen
Ye Wang
Tieru Wu
Rui Ma
3DPC
32
0
0
29 Dec 2023