Papers
Communities
Organizations
Events
Blog
Pricing
Search
Open menu
Home
Papers
2503.22436
Cited By
v1
v2 (latest)
NuGrounding: A Multi-View 3D Visual Grounding Framework in Autonomous Driving
28 March 2025
Fuhao Li
Huan Jin
Bin-Bin Gao
Liaoyuan Fan
Lihui Jiang
Long Zeng
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"NuGrounding: A Multi-View 3D Visual Grounding Framework in Autonomous Driving"
42 / 42 papers shown
Title
Foundation Models in Autonomous Driving: A Survey on Scenario Generation and Scenario Analysis
Yuan Gao
Mattia Piccinini
Yuchen Zhang
Dingrui Wang
Korbinian Moller
...
Steven Peters
Andrea Stocco
Bassam Alrifaee
Marco Pavone
Johannes Betz
39
0
0
13 Jun 2025
RefAV: Towards Planning-Centric Scenario Mining
Cainan Davidson
Deva Ramanan
Neehar Peri
91
2
0
27 May 2025
On the Emergence of Thinking in LLMs I: Searching for the Right Intuition
Guanghao Ye
Khiem Duc Pham
Xinzhi Zhang
Sivakanth Gopi
Baolin Peng
Beibin Li
Janardhan Kulkarni
Huseyin A. Inan
ReLM
LRM
157
10
0
10 Feb 2025
DriveLM: Driving with Graph Visual Question Answering
Chonghao Sima
Katrin Renz
Kashyap Chitta
Lawrence Yunliang Chen
Hanxue Zhang
Chengen Xie
Jens Beißwenger
Ping Luo
Andreas Geiger
Hongyang Li
323
208
0
17 Jan 2025
VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks
Jiannan Wu
Muyan Zhong
Sen Xing
Zeqiang Lai
Zhaoyang Liu
...
Lewei Lu
Tong Lu
Ping Luo
Yu Qiao
Jifeng Dai
MLLM
VLM
LRM
376
59
0
03 Jan 2025
LaVida Drive: Vision-Text Interaction VLM for Autonomous Driving with Token Selection, Recovery and Enhancement
Siwen Jiao
Yangyi Fang
Baoyun Peng
Wangqun Chen
Bharadwaj Veeravalli
240
5
0
20 Nov 2024
LLaVA-3D: A Simple yet Effective Pathway to Empowering LMMs with 3D-awareness
Chenming Zhu
Tai Wang
Wenwei Zhang
Jiangmiao Pang
Xihui Liu
272
52
0
26 Sep 2024
LLMI3D: MLLM-based 3D Perception from a Single 2D Image
Fan Yang
Sicheng Zhao
Yanhao Zhang
Haoxiang Chen
Hui Chen
Wenbo Tang
Guiguang Ding
98
3
0
14 Aug 2024
SegPoint: Segment Any Point Cloud via Large Language Model
Shuting He
Henghui Ding
Xudong Jiang
Bihan Wen
3DV
MLLM
3DPC
106
19
0
18 Jul 2024
Bootstrapping Referring Multi-Object Tracking
Yani Zhang
Dongming Wu
Wencheng Han
Xingping Dong
126
8
0
07 Jun 2024
Is a 3D-Tokenized LLM the Key to Reliable Autonomous Driving?
Yifan Bai
Dongming Wu
Yingfei Liu
Fan Jia
Weixin Mao
...
Yucheng Zhao
Jianbing Shen
Xing Wei
Tiancai Wang
Xiangyu Zhang
MLLM
94
12
0
28 May 2024
TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes
Bu Jin
Yupeng Zheng
Pengfei Li
Weize Li
Yuhang Zheng
...
Kun Zhan
Peng Jia
Xiaoxiao Long
Yilun Chen
Hao Zhao
3DV
114
20
0
28 Mar 2024
Embodied Understanding of Driving Scenarios
Yunsong Zhou
Linyan Huang
Qingwen Bu
Jia Zeng
Tianyu Li
Hang Qiu
Hongzi Zhu
Minyi Guo
Yu Qiao
Hongyang Li
LM&Ro
111
33
0
07 Mar 2024
DriveVLM: The Convergence of Autonomous Driving and Large Vision-Language Models
Xiaoyu Tian
Junru Gu
Bailin Li
Yicheng Liu
Yang Wang
Chenxu Hu
Kun Zhan
Peng Jia
Xianpeng Lang
Hang Zhao
VLM
218
165
0
19 Feb 2024
Holistic Autonomous Driving Understanding by Bird's-Eye-View Injected Multi-Modal Large Models
Xinpeng Ding
Jinahua Han
Hang Xu
Xiaodan Liang
Wei Zhang
Xiaomeng Li
111
48
0
02 Jan 2024
InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks
Zhe Chen
Jiannan Wu
Wenhai Wang
Weijie Su
Guo Chen
...
Bin Li
Ping Luo
Tong Lu
Yu Qiao
Jifeng Dai
VLM
MLLM
330
1,217
0
21 Dec 2023
Mono3DVG: 3D Visual Grounding in Monocular Images
Yangfan Zhan
Yuan. Yuan
Zhitong Xiong
MDE
76
14
0
13 Dec 2023
NuScenes-MQA: Integrated Evaluation of Captions and QA for Autonomous Driving Datasets using Markup Annotations
Yuichi Inoue
Yuki Yada
Kotaro Tanahashi
Yu Yamaguchi
76
23
0
11 Dec 2023
Improved Baselines with Visual Instruction Tuning
Haotian Liu
Chunyuan Li
Yuheng Li
Yong Jae Lee
VLM
MLLM
260
2,834
0
05 Oct 2023
DriveGPT4: Interpretable End-to-end Autonomous Driving via Large Language Model
Zhenhua Xu
Yujia Zhang
Enze Xie
Zhen Zhao
Yong Guo
Kwan-Yee. K. Wong
Zhenguo Li
Hengshuang Zhao
MLLM
154
310
0
02 Oct 2023
Language Prompt for Autonomous Driving
Dongming Wu
Wencheng Han
Tiancai Wang
Yingfei Liu
Cheng-zhong Xu
Jianbing Shen
Jianbing Shen
VLM
136
87
0
08 Sep 2023
LISA: Reasoning Segmentation via Large Language Model
Xin Lai
Zhuotao Tian
Yukang Chen
Yanwei Li
Yuhui Yuan
Shu Liu
Jiaya Jia
LM&Ro
VLM
MLLM
LRM
173
463
0
01 Aug 2023
NuScenes-QA: A Multi-modal Visual Question Answering Benchmark for Autonomous Driving Scenario
Tianwen Qian
Jingjing Chen
Linhai Zhuo
Yang Jiao
Yueping Jiang
102
158
0
24 May 2023
MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models
Deyao Zhu
Jun Chen
Xiaoqian Shen
Xiang Li
Mohamed Elhoseiny
VLM
MLLM
183
2,080
0
20 Apr 2023
Visual Instruction Tuning
Haotian Liu
Chunyuan Li
Qingyang Wu
Yong Jae Lee
SyDa
VLM
MLLM
592
4,962
0
17 Apr 2023
Exploring Object-Centric Temporal Modeling for Efficient Multi-View 3D Object Detection
Shihao Wang
Yingfei Liu
Tiancai Wang
Ying Li
Xiangyu Zhang
3DPC
142
211
0
21 Mar 2023
Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection
Shilong Liu
Zhaoyang Zeng
Tianhe Ren
Feng Li
Hao Zhang
...
Chun-yue Li
Jianwei Yang
Hang Su
Jun Zhu
Lei Zhang
ObjD
271
2,037
0
09 Mar 2023
Referring Multi-Object Tracking
Dongming Wu
Wencheng Han
Tiancai Wang
Xingping Dong
Xiangyu Zhang
Jianbing Shen
114
80
0
06 Mar 2023
EDA: Explicit Text-Decoupling and Dense Alignment for 3D Visual Grounding
Yanmin Wu
Xinhua Cheng
Renrui Zhang
Zesen Cheng
Jian Zhang
164
70
0
29 Sep 2022
BEVDepth: Acquisition of Reliable Depth for Multi-view 3D Object Detection
Yinhao Li
Zheng Ge
Guanyi Yu
Jinrong Yang
Zengran Wang
Yukang Shi
Jian‐Yuan Sun
Zeming Li
MDE
157
634
0
21 Jun 2022
UCC: Uncertainty guided Cross-head Co-training for Semi-Supervised Semantic Segmentation
Jiashuo Fan
Bin-Bin Gao
Huan Jin
Lihui Jiang
115
64
0
20 May 2022
BEVDet4D: Exploit Temporal Cues in Multi-camera 3D Object Detection
Junjie Huang
Guan Huang
202
347
0
31 Mar 2022
PETR: Position Embedding Transformation for Multi-View 3D Object Detection
Yingfei Liu
Tiancai Wang
Xinming Zhang
Jian Sun
3DPC
195
556
0
10 Mar 2022
BEVDet: High-performance Multi-camera 3D Object Detection in Bird-Eye-View
Junjie Huang
Guan Huang
Zheng Zhu
Yun Ye
Dalong Du
3DPC
186
711
0
22 Dec 2021
Grounded Language-Image Pre-training
Liunian Harold Li
Pengchuan Zhang
Haotian Zhang
Jianwei Yang
Chunyuan Li
...
Lu Yuan
Lei Zhang
Lei Li
Kai-Wei Chang
Jianfeng Gao
ObjD
VLM
223
1,073
0
07 Dec 2021
Focus on Local: Detecting Lane Marker from Bottom Up via Key Point
Z. Qu
Huan Jin
Yang Zhou
Zhen Yang
Wei Zhang
126
134
0
28 May 2021
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
860
41,948
0
22 Oct 2020
Lift, Splat, Shoot: Encoding Images From Arbitrary Camera Rigs by Implicitly Unprojecting to 3D
Jonah Philion
Sanja Fidler
144
1,064
0
13 Aug 2020
Talk2Car: Taking Control of Your Self-Driving Car
Thierry Deruyttere
Simon Vandenhende
Dusan Grujicic
Luc Van Gool
Marie-Francine Moens
LM&Ro
78
134
0
24 Sep 2019
An Energy and GPU-Computation Efficient Backbone Network for Real-Time Object Detection
Youngwan Lee
Joong-won Hwang
Sangrok Lee
Yuseok Bae
Jongyoul Park
PINN
ObjD
87
374
0
22 Apr 2019
nuScenes: A multimodal dataset for autonomous driving
Holger Caesar
Varun Bankiti
Alex H. Lang
Sourabh Vora
Venice Erin Liong
Qiang Xu
Anush Krishnan
Yuxin Pan
G. Baldan
Oscar Beijbom
3DPC
478
5,809
0
26 Mar 2019
Decoupled Weight Decay Regularization
I. Loshchilov
Frank Hutter
OffRL
175
2,166
0
14 Nov 2017
1