ResearchTrend.AI
Vision Language Models in Autonomous Driving: A Survey and Outlook


22 October 2023
Xingcheng Zhou
Mingyu Liu
Ekim Yurtsever
B. L. Žagar
Walter Zimmer
Hu Cao
Alois C. Knoll
    VLM

Papers citing "Vision Language Models in Autonomous Driving: A Survey and Outlook"

36 / 36 papers shown
Towards Human-Centric Autonomous Driving: A Fast-Slow Architecture Integrating Large Language Model Guidance with Reinforcement Learning
Chengkai Xu
Jiaqi Liu
Yicheng Guo
Wenjie Qu
Peng Hang
Jian Sun
31
0
0
11 May 2025
Task-Oriented Semantic Communication in Large Multimodal Models-based Vehicle Networks
Baoxia Du
H. Du
Dusit Niyato
Ruidong Li
58
0
0
05 May 2025
T2VPhysBench: A First-Principles Benchmark for Physical Consistency in Text-to-Video Generation
Xuyang Guo
Jiayan Huo
Zhenmei Shi
Zhao Song
Jiahao Zhang
Jiale Zhao
EGVM
VGen
PINN
82
1
0
01 May 2025
SCAM: A Real-World Typographic Robustness Evaluation for Multimodal Foundation Models
Justus Westerhoff
Erblina Purellku
Jakob Hackstein
Jonas Loos
Leo Pinetzki
Lorenz Hufe
AAML
28
0
0
07 Apr 2025
OpenDriveVLA: Towards End-to-end Autonomous Driving with Large Vision Language Action Model
Xingcheng Zhou
Xuyuan Han
Feng Yang
Yunpu Ma
Alois C. Knoll
VLM
61
1
0
30 Mar 2025
O-TPT: Orthogonality Constraints for Calibrating Test-time Prompt Tuning in Vision-Language Models
Ashshak Sharifdeen
Muhammad Akhtar Munir
Sanoojan Baliah
Salman Khan
M. H. Khan
VLM
54
0
0
15 Mar 2025
Towards Vision Zero: The Accid3nD Dataset
Walter Zimmer
Ross Greer
Daniel Lehmberg
Marc Pavel
Holger Caesar
...
Mohan M. Trivedi
Rui Song
Hu Cao
Akshay Gopalkrishnan
Alois C. Knoll
55
0
0
15 Mar 2025
Road Rage Reasoning with Vision-language Models (VLMs): Task Definition and Evaluation Dataset
Yibing Weng
Yu Gu
Fuji Ren
63
0
0
14 Mar 2025
A Framework for a Capability-driven Evaluation of Scenario Understanding for Multimodal Large Language Models in Autonomous Driving
Tin Stribor Sohn
Philipp Reis
Maximilian Dillitzer
Johannes Bach
Jason J. Corso
Eric Sax
ELM
LRM
56
0
0
14 Mar 2025
Talk2PC: Enhancing 3D Visual Grounding through LiDAR and Radar Point Clouds Fusion for Autonomous Driving
Runwei Guan
Jianan Liu
Ningwei Ouyang
Daizong Liu
Xiaolou Sun
Lianqing Zheng
Ming Xu
Yutao Yue
Hui Xiong
66
1
0
11 Mar 2025
VLRMBench: A Comprehensive and Challenging Benchmark for Vision-Language Reward Models
Jiacheng Ruan
Wenzhen Yuan
Xian Gao
Ye Guo
Daoxin Zhang
Zhe Xu
Yao Hu
Ting Liu
Yuzhuo Fu
LRM
VLM
68
4
0
10 Mar 2025
CurricuVLM: Towards Safe Autonomous Driving via Personalized Safety-Critical Curriculum Learning with Vision-Language Models
Zihao Sheng
Zilin Huang
Yansong Qu
Yue Leng
Sruthi Bhavanam
Sikai Chen
48
2
0
24 Feb 2025
Multi-Agent Autonomous Driving Systems with Large Language Models: A Survey of Recent Advances
Yaozu Wu
Dongyuan Li
Yankai Chen
Renhe Jiang
Henry Peng Zou
Liancheng Fang
Zhen Wang
Philip S. Yu
LLMAG
73
2
0
24 Feb 2025
INSIGHT: Enhancing Autonomous Driving Safety through Vision-Language Models on Context-Aware Hazard Detection and Edge Case Evaluation
Dianwei Chen
Zifan Zhang
Yuchen Liu
Xianfeng Terry Yang
VLM
62
3
0
01 Feb 2025
Enhancing Highway Safety: Accident Detection on the A9 Test Stretch Using Roadside Sensors
Walter Zimmer
Ross Greer
Xingcheng Zhou
Rui Song
Marc Pavel
Daniel Lehmberg
Ahmed Ghita
Akshay Gopalkrishnan
Mohan M. Trivedi
Alois C. Knoll
76
1
0
01 Feb 2025
RLS3: RL-Based Synthetic Sample Selection to Enhance Spatial Reasoning in Vision-Language Models for Indoor Autonomous Perception
Joshua R. Waite
Md Zahid Hasan
Qisai Liu
Zhanhong Jiang
Chinmay Hegde
S. Sarkar
OffRL
SyDa
179
1
0
31 Jan 2025
Integrating LLMs with ITS: Recent Advances, Potentials, Challenges, and Future Directions
Doaa Mahmud
Hadeel Hajmohamed
Shamma Almentheri
Shamma Alqaydi
Lameya Aldhaheri
R. A. Khalil
Nasir Saeed
AI4TS
40
5
0
08 Jan 2025
Large-scale moral machine experiment on large language models
Muhammad Shahrul Zaim bin Ahmad
Kazuhiro Takemoto
ELM
AILaw
41
1
1
31 Dec 2024
World Models: The Safety Perspective
Zifan Zeng
Chongzhe Zhang
Feng Liu
Joseph Sifakis
Qunli Zhang
Shiming Liu
Peng Wang
KELM
LLMAG
42
1
0
12 Nov 2024
From Words to Wheels: Automated Style-Customized Policy Generation for Autonomous Driving
Xu Han
Xianda Chen
Zhenghan Cai
Pinlong Cai
Meixin Zhu
Xiaowen Chu
42
1
0
18 Sep 2024
Multi-Frame Vision-Language Model for Long-form Reasoning in Driver Behavior Analysis
Hiroshi Takato
Hiroshi Tsutsui
Komei Soda
Hidetaka Kamigaito
VLM
35
0
0
03 Aug 2024
Talk2Radar: Bridging Natural Language with 4D mmWave Radar for 3D Referring Expression Comprehension
Runwei Guan
Ruixiao Zhang
Ningwei Ouyang
Jianan Liu
Ka Lok Man
...
Ming Xu
Jeremy S. Smith
Eng Gee Lim
Yutao Yue
Hui Xiong
53
9
0
21 May 2024
DriveVLM: The Convergence of Autonomous Driving and Large Vision-Language Models
Xiaoyu Tian
Junru Gu
Bailin Li
Yicheng Liu
Yang Wang
Chenxu Hu
Kun Zhan
Peng Jia
Xianpeng Lang
Hang Zhao
VLM
73
126
0
19 Feb 2024
DrivingDiffusion: Layout-Guided multi-view driving scene video generation with latent diffusion model
Xiaofan Li
Yifu Zhang
Xiaoqing Ye
VGen
73
71
0
11 Oct 2023
LanguageMPC: Large Language Models as Decision Makers for Autonomous Driving
Hao Sha
Yao Mu
Yuxuan Jiang
Li Chen
Chenfeng Xu
Ping Luo
Shengbo Eben Li
Masayoshi Tomizuka
Wei Zhan
Mingyu Ding
123
159
0
04 Oct 2023
VAD: Vectorized Scene Representation for Efficient Autonomous Driving
Bo Jiang
Shaoyu Chen
Qing Xu
Bencheng Liao
Jiajie Chen
Helong Zhou
Qian Zhang
Wenyu Liu
Chang Huang
Xinggang Wang
110
195
0
21 Mar 2023
V2V4Real: A Real-world Large-scale Dataset for Vehicle-to-Vehicle Cooperative Perception
Runsheng Xu
Xin Xia
Jinlong Li
Hanzhao Li
Shuo Zhang
...
Xiaoyu Dong
Rui Song
Hongkai Yu
Bolei Zhou
Jiaqi Ma
59
149
0
14 Mar 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
287
4,261
0
30 Jan 2023
DRAMA: Joint Risk Localization and Captioning in Driving
Srikanth Malla
Chiho Choi
Isht Dwivedi
Joonhyang Choi
Jiachen Li
107
87
0
22 Sep 2022
Real-Time And Robust 3D Object Detection with Roadside LiDARs
Walter Zimmer
Jialong Wu
Xin Zhou
Alois C. Knoll
3DPC
34
11
0
11 Jul 2022
V2X-ViT: Vehicle-to-Everything Cooperative Perception with Vision Transformer
Runsheng Xu
Hao Xiang
Zhengzhong Tu
Xin Xia
Ming-Hsuan Yang
Jiaqi Ma
ViT
117
362
0
20 Mar 2022
Towards Efficient Post-training Quantization of Pre-trained Language Models
Haoli Bai
Lu Hou
Lifeng Shang
Xin Jiang
Irwin King
M. Lyu
MQ
79
47
0
30 Sep 2021
VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding
Hu Xu
Gargi Ghosh
Po-Yao (Bernie) Huang
Dmytro Okhonko
Armen Aghajanyan
Florian Metze
Luke Zettlemoyer
Christoph Feichtenhofer
CLIP
VLM
259
560
0
28 Sep 2021
Learning to Prompt for Vision-Language Models
Kaiyang Zhou
Jingkang Yang
Chen Change Loy
Ziwei Liu
VPVLM
CLIP
VLM
348
2,271
0
02 Sep 2021
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester
Rami Al-Rfou
Noah Constant
VPVLM
280
3,858
0
18 Apr 2021
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
264
4,489
0
23 Jan 2020