arXiv:2410.22306
Multi-Object 3D Grounding with Dynamic Modules and Language-Informed Spatial Attention
29 October 2024
Haomeng Zhang
Chiao-An Yang
Raymond A. Yeh
Papers citing
"Multi-Object 3D Grounding with Dynamic Modules and Language-Informed Spatial Attention"
28 / 28 papers shown
Visual Programming for Zero-shot Open-Vocabulary 3D Visual Grounding
Zhihao Yuan
Jinke Ren
Chun-Mei Feng
Hengshuang Zhao
Shuguang Cui
Zhen Li
98
30
0
26 Nov 2023
LLM-Grounder: Open-Vocabulary 3D Visual Grounding with Large Language Model as an Agent
Jianing Yang
Xuweiyi Chen
Shengyi Qian
Nikhil Madaan
Madhavan Iyengar
David Fouhey
Joyce Chai
LM&Ro
LLMAG
142
101
0
21 Sep 2023
Four Ways to Improve Verbo-visual Fusion for Dense 3D Visual Grounding
Ozan Unal
Daniel Gehrig
Suman Saha
Luc Van Gool
72
17
0
08 Sep 2023
NS3D: Neuro-Symbolic Grounding of 3D Objects and Relations
Joy Hsu
Jiayuan Mao
Jiajun Wu
PINN
83
53
0
23 Mar 2023
UniT3D: A Unified Transformer for 3D Dense Captioning and Visual Grounding
Dave Zhenyu Chen
Ronghang Hu
Xinlei Chen
Matthias Nießner
Angel X. Chang
113
54
0
01 Dec 2022
Language Conditioned Spatial Relation Reasoning for 3D Object Grounding
Shizhe Chen
Pierre-Louis Guhur
Makarand Tapaswi
Cordelia Schmid
Ivan Laptev
90
88
0
17 Nov 2022
CLIP2Point: Transfer CLIP to Point Cloud Classification with Image-Depth Pre-training
Tianyu Huang
Bowen Dong
Yunhan Yang
Xiaoshui Huang
Rynson W. H. Lau
Wanli Ouyang
W. Zuo
VLM
3DPC
CLIP
122
149
0
03 Oct 2022
TransVG++: End-to-End Visual Grounding with Language Conditioned Vision Transformer
Jiajun Deng
Zhengyuan Yang
Daqing Liu
Tianlang Chen
Wen-gang Zhou
Yanyong Zhang
Houqiang Li
Wanli Ouyang
ViT
94
53
0
14 Jun 2022
3D-SPS: Single-Stage 3D Visual Grounding via Referred Point Progressive Selection
Jun-Bin Luo
Jiahui Fu
Xianghao Kong
Chen Gao
Haibing Ren
Hao Shen
Huaxia Xia
Si Liu
87
95
0
13 Apr 2022
Multi-View Transformer for 3D Visual Grounding
Shijia Huang
Yilun Chen
Jiaya Jia
Liwei Wang
96
127
0
05 Apr 2022
TransFusion: Robust LiDAR-Camera Fusion for 3D Object Detection with Transformers
Xuyang Bai
Zeyu Hu
Xinge Zhu
Qingqiu Huang
Yilun Chen
Hongbo Fu
Chiew-Lan Tai
ViT
3DPC
118
612
0
22 Mar 2022
CLIP-NeRF: Text-and-Image Driven Manipulation of Neural Radiance Fields
Can Wang
Menglei Chai
Mingming He
Dongdong Chen
Jing Liao
CLIP
130
386
0
09 Dec 2021
D3Net: A Unified Speaker-Listener Architecture for 3D Dense Captioning and Visual Grounding
Dave Zhenyu Chen
Qirui Wu
Matthias Nießner
Angel X. Chang
73
32
0
02 Dec 2021
TransRefer3D: Entity-and-Relation Aware Transformer for Fine-Grained 3D Visual Grounding
Dailan He
Yusheng Zhao
Junyu Luo
Tianrui Hui
Shaofei Huang
Aixi Zhang
Si Liu
ViT
51
95
0
05 Aug 2021
LanguageRefer: Spatial-Language Model for 3D Visual Grounding
Junha Roh
Karthik Desingh
Ali Farhadi
Dieter Fox
77
95
0
07 Jul 2021
Referring Transformer: A One-step Approach to Multi-task Visual Grounding
Muchen Li
Leonid Sigal
ObjD
107
193
0
06 Jun 2021
SAT: 2D Semantics Assisted Training for 3D Visual Grounding
Zhengyuan Yang
Songyang Zhang
Liwei Wang
Jiebo Luo
3DPC
88
126
0
24 May 2021
Graph-Structured Referring Expression Reasoning in The Wild
Sibei Yang
Guanbin Li
Yizhou Yu
NAI
67
95
0
19 Apr 2020
PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation
Li Jiang
Hengshuang Zhao
Shaoshuai Shi
Shu Liu
Chi-Wing Fu
Jiaya Jia
3DPC
86
437
0
03 Apr 2020
ImVoteNet: Boosting 3D Object Detection in Point Clouds with Image Votes
C. Qi
Xinlei Chen
Or Litany
Leonidas Guibas
3DPC
241
252
0
29 Jan 2020
ScanRefer: 3D Object Localization in RGB-D Scans using Natural Language
Dave Zhenyu Chen
Angel X. Chang
Matthias Nießner
3DPC
102
379
0
18 Dec 2019
Dynamic Graph Attention for Referring Expression Comprehension
Sibei Yang
Guanbin Li
Yizhou Yu
OCL
75
220
0
18 Sep 2019
Zero-Shot Grounding of Objects from Natural Language Queries
Arka Sadhu
Kan Chen
Ram Nevatia
ObjD
91
159
0
20 Aug 2019
A Fast and Accurate One-Stage Approach to Visual Grounding
Zhengyuan Yang
Boqing Gong
Liwei Wang
Wenbing Huang
Dong Yu
Jiebo Luo
ObjD
58
364
0
18 Aug 2019
Neighbourhood Watch: Referring Expression Comprehension via Language-guided Graph Attention Networks
Peng Wang
Qi Wu
Jiewei Cao
Chunhua Shen
Lianli Gao
Anton Van Den Hengel
ObjD
93
255
0
12 Dec 2018
MAttNet: Modular Attention Network for Referring Expression Comprehension
Licheng Yu
Zhe Lin
Xiaohui Shen
Jimei Yang
Xin Lu
Joey Tianyi Zhou
Tamara L. Berg
ObjD
119
831
0
24 Jan 2018
Parallel Attention: A Unified Framework for Visual Object Discovery through Dialogs and Queries
Bohan Zhuang
Qi Wu
Chunhua Shen
Ian Reid
Anton Van Den Hengel
ObjD
68
134
0
17 Nov 2017
ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes
Angela Dai
Angel X. Chang
Manolis Savva
Maciej Halber
Thomas Funkhouser
Matthias Nießner
3DPC
3DV
513
4,088
0
14 Feb 2017