Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2403.19838
Cited By
Multi-Frame, Lightweight & Efficient Vision-Language Models for Question Answering in Autonomous Driving
28 March 2024
Akshay Gopalkrishnan
Ross Greer
Mohan M. Trivedi
VLM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Multi-Frame, Lightweight & Efficient Vision-Language Models for Question Answering in Autonomous Driving"
12 / 12 papers shown
Title
DriveSOTIF: Advancing Perception SOTIF Through Multimodal Large Language Models
Shucheng Huang
Freda Shi
Chen Sun
Jiaming Zhong
Minghao Ning
Yufeng Yang
Yukun Lu
Hong Wang
A. Khajepour
31
0
0
11 May 2025
A Review of 3D Object Detection with Vision-Language Models
Ranjan Sapkota
Konstantinos I Roumeliotis
Rahul Harsha Cheppally
Marco Flores Calero
Manoj Karkee
VLM
82
2
0
25 Apr 2025
NuScenes-SpatialQA: A Spatial Understanding and Reasoning Benchmark for Vision-Language Models in Autonomous Driving
Kexin Tian
Jingrui Mao
Y. Zhang
Jiwan Jiang
Yang Zhou
Zhengzhong Tu
CoGe
68
0
0
04 Apr 2025
Urban Computing in the Era of Large Language Models
Zhonghang Li
Lianghao Xia
Xubin Ren
J. Tang
Tianyi Chen
Yong-mei Xu
Chenyu Huang
83
0
0
02 Apr 2025
Vision-Language Models for Edge Networks: A Comprehensive Survey
Ahmed Sharshar
Latif U. Khan
Waseem Ullah
Mohsen Guizani
VLM
70
3
0
11 Feb 2025
LaVida Drive: Vision-Text Interaction VLM for Autonomous Driving with Token Selection, Recovery and Enhancement
Siwen Jiao
Yangyi Fang
Baoyun Peng
Wangqun Chen
Bharadwaj Veeravalli
85
4
0
20 Nov 2024
Perception Without Vision for Trajectory Prediction: Ego Vehicle Dynamics as Scene Representation for Efficient Active Learning in Autonomous Driving
Ross Greer
Mohan M. Trivedi
34
2
0
15 May 2024
Towards Explainable, Safe Autonomous Driving with Language Embeddings for Novelty Identification and Active Learning: Framework and Experimental Analysis with Real-World Data Sets
Ross Greer
Mohan M. Trivedi
34
19
0
11 Feb 2024
MIVC: Multiple Instance Visual Component for Visual-Language Models
Wenyi Wu
Qi Li
Leon Wenliang Zhong
Junzhou Huang
31
3
0
28 Dec 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
272
4,244
0
30 Jan 2023
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
370
8,495
0
28 Jan 2022
Unifying Vision-and-Language Tasks via Text Generation
Jaemin Cho
Jie Lei
Hao Tan
Joey Tianyi Zhou
MLLM
256
525
0
04 Feb 2021
1