ResearchTrend.AI
DriveVLM: The Convergence of Autonomous Driving and Large Vision-Language Models

19 February 2024
Xiaoyu Tian
Junru Gu
Bailin Li
Yicheng Liu
Yang Wang
Chenxu Hu
Kun Zhan
Peng Jia
Xianpeng Lang
Hang Zhao
    VLM

Papers citing "DriveVLM: The Convergence of Autonomous Driving and Large Vision-Language Models"

34 / 34 papers shown
Extending Large Vision-Language Model for Diverse Interactive Tasks in Autonomous Driving
Zongchuang Zhao
Haoyu Fu
Dingkang Liang
Xin Zhou
Dingyuan Zhang
Hongwei Xie
Bing Wang
Xiang Bai
MLLM
VLM
49
0
0
13 May 2025
ReLI: A Language-Agnostic Approach to Human-Robot Interaction
Linus Nwankwo
Bjoern Ellensohn
Ozan Özdenizci
Elmar Rueckert
LM&Ro
55
0
0
03 May 2025
Towards Efficient Online Tuning of VLM Agents via Counterfactual Soft Reinforcement Learning
Lang Feng
Weihao Tan
Zhiyi Lyu
Longtao Zheng
Haiyang Xu
M. Yan
Fei Huang
Bo An
26
0
0
01 May 2025
V3LMA: Visual 3D-enhanced Language Model for Autonomous Driving
Jannik Lübberstedt
Esteban Rivera
Nico Uhlemann
Markus Lienkamp
MLLM
63
0
0
30 Apr 2025
LangCoop: Collaborative Driving with Language
Xiangbo Gao
Yuheng Wu
Rujia Wang
Chenxi Liu
Yang Zhou
Zhengzhong Tu
VLM
38
0
0
18 Apr 2025
NuScenes-SpatialQA: A Spatial Understanding and Reasoning Benchmark for Vision-Language Models in Autonomous Driving
Kexin Tian
Jingrui Mao
Y. Zhang
Jiwan Jiang
Yang Zhou
Zhengzhong Tu
CoGe
68
0
0
04 Apr 2025
Recitation over Reasoning: How Cutting-Edge Language Models Can Fail on Elementary School-Level Reasoning Problems?
Kai Yan
Yufei Xu
Zhengyin Du
Xuesong Yao
Z. Wang
Xiaowen Guo
Jiecao Chen
ReLM
ELM
LRM
95
3
0
01 Apr 2025
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey
Y. Wang
Shengqiong Wu
Y. Zhang
William Yang Wang
Ziwei Liu
Jiebo Luo
Hao Fei
LRM
92
8
0
16 Mar 2025
TAIJI: Textual Anchoring for Immunizing Jailbreak Images in Vision Language Models
Xiangyu Yin
Yi Qi
Jinwei Hu
Zhen Chen
Yi Dong
Xingyu Zhao
Xiaowei Huang
Wenjie Ruan
45
0
0
13 Mar 2025
Unlock the Power of Unlabeled Data in Language Driving Model
Chaoqun Wang
Jie-jin Yang
Xiaobin Hong
Ruimao Zhang
53
0
0
13 Mar 2025
Other Vehicle Trajectories Are Also Needed: A Driving World Model Unifies Ego-Other Vehicle Trajectories in Video Latent Space
Jian Zhu
Zhengyu Jia
Tian Gao
Jiaxin Deng
Shidi Li
Fu Liu
Peng Jia
Xianpeng Lang
Xiaolong Sun
VGen
155
0
0
12 Mar 2025
CoLMDriver: LLM-based Negotiation Benefits Cooperative Autonomous Driving
Changxing Liu
Genjia Liu
Z. Wang
Jinchang Yang
Siheng Chen
62
0
0
11 Mar 2025
Towards Effective and Efficient Context-aware Nucleus Detection in Histopathology Whole Slide Images
Zhongyi Shui
Ruizhe Guo
Honglin Li
Yuxuan Sun
Yunlong Zhang
Chenglu Zhu
Jiatong Cai
Pingyi Chen
Yanzhou Su
Lin Yang
46
0
0
04 Mar 2025
Stealthy Backdoor Attack in Self-Supervised Learning Vision Encoders for Large Vision Language Models
Zhaoyi Liu
Huan Zhang
AAML
80
0
0
25 Feb 2025
ConvoyLLM: Dynamic Multi-Lane Convoy Control Using LLMs
Liping Lu
Zhican He
Duanfeng Chu
Rukang Wang
Saiqian Peng
Pan Zhou
36
0
0
24 Feb 2025
DriveLM: Driving with Graph Visual Question Answering
Chonghao Sima
Katrin Renz
Kashyap Chitta
L. Chen
Hanxue Zhang
Chengen Xie
Jens Beißwenger
Ping Luo
Andreas Geiger
Hongyang Li
96
162
0
17 Jan 2025
CoDriveVLM: VLM-Enhanced Urban Cooperative Dispatching and Motion Planning for Future Autonomous Mobility on Demand Systems
Haichao Liu
Ruoyu Yao
Wenru Liu
Zhenmin Huang
Shaojie Shen
Jun Ma
42
2
0
10 Jan 2025
AdvDreamer Unveils: Are Vision-Language Models Truly Ready for Real-World 3D Variations?
Shouwei Ruan
Hanqin Liu
Yao Huang
Xiaoqi Wang
Caixin Kang
Hang Su
Yinpeng Dong
Xingxing Wei
VGen
93
0
0
04 Dec 2024
LaVida Drive: Vision-Text Interaction VLM for Autonomous Driving with Token Selection, Recovery and Enhancement
Siwen Jiao
Yangyi Fang
Baoyun Peng
Wangqun Chen
Bharadwaj Veeravalli
85
4
0
20 Nov 2024
RenderWorld: World Model with Self-Supervised 3D Label
Ziyang Yan
Wenzhen Dong
Yihua Shao
Yuhang Lu
Liu Haiyang
...
Haozhe Wang
Zhe Wang
Yan Wang
Fabio Remondino
Yuexin Ma
3DV
VGen
72
13
0
17 Sep 2024
Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding
Yunze Man
Shuhong Zheng
Zhipeng Bao
M. Hebert
Liang-Yan Gui
Yu-xiong Wang
72
15
0
05 Sep 2024
Enhancing End-to-End Autonomous Driving with Latent World Model
Yingyan Li
Lue Fan
Jiawei He
Yuqi Wang
Yuntao Chen
Zhaoxiang Zhang
Tieniu Tan
75
8
0
12 Jun 2024
ROADWork Dataset: Learning to Recognize, Observe, Analyze and Drive Through Work Zones
Anurag Ghosh
R. Tamburo
Shen Zheng
Juan R. Alvarez-Padilla
Hailiang Zhu
Michael Cardei
Nicholas Dunn
Christoph Mertz
Srinivasa G. Narasimhan
44
1
0
11 Jun 2024
DriVLMe: Enhancing LLM-based Autonomous Driving Agents with Embodied and Social Experiences
Yidong Huang
Jacob Sansom
Ziqiao Ma
Felix Gervits
Joyce Chai
44
17
0
05 Jun 2024
MetaToken: Detecting Hallucination in Image Descriptions by Meta Classification
Laura Fieback
Jakob Spiegelberg
Hanno Gottschalk
MLLM
59
5
0
29 May 2024
Physical Backdoor Attack can Jeopardize Driving with Vision-Large-Language Models
Zhenyang Ni
Rui Ye
Yuxian Wei
Zhen Xiang
Yanfeng Wang
Siheng Chen
AAML
36
9
0
19 Apr 2024
MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding
Bo He
Hengduo Li
Young Kyun Jang
Menglin Jia
Xuefei Cao
Ashish Shah
Abhinav Shrivastava
Ser-Nam Lim
MLLM
81
88
0
08 Apr 2024
LanguageMPC: Large Language Models as Decision Makers for Autonomous Driving
Hao Sha
Yao Mu
Yuxuan Jiang
Li Chen
Chenfeng Xu
Ping Luo
Shengbo Eben Li
Masayoshi Tomizuka
Wei Zhan
Mingyu Ding
114
159
0
04 Oct 2023
VAD: Vectorized Scene Representation for Efficient Autonomous Driving
Bo Jiang
Shaoyu Chen
Qing Xu
Bencheng Liao
Jiajie Chen
Helong Zhou
Qian Zhang
Wenyu Liu
Chang Huang
Xinggang Wang
110
194
0
21 Mar 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
270
4,229
0
30 Jan 2023
DRAMA: Joint Risk Localization and Captioning in Driving
Srikanth Malla
Chiho Choi
Isht Dwivedi
Joonhyang Choi
Jiachen Li
101
87
0
22 Sep 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
355
8,457
0
28 Jan 2022
Learning to drive from a world on rails
Di Chen
V. Koltun
Philipp Krahenbuhl
98
116
0
03 May 2021
PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation
C. Qi
Hao Su
Kaichun Mo
Leonidas J. Guibas
3DH
3DPC
3DV
PINN
222
14,099
0
02 Dec 2016