ResearchTrend.AI

Forging Vision Foundation Models for Autonomous Driving: Challenges, Methodologies, and Opportunities

16 January 2024
Xu Yan
Haiming Zhang
Yingjie Cai
Jingming Guo
Weichao Qiu
Bin-Bin Gao
Kaiqiang Zhou
Yue Zhao
Huan Jin
Jiantao Gao
Zhen Li
Lihui Jiang
Wei Zhang
Hongbo Zhang
Dengxin Dai
Bingbing Liu

Papers citing "Forging Vision Foundation Models for Autonomous Driving: Challenges, Methodologies, and Opportunities"

50 / 79 papers shown
A Survey of World Models for Autonomous Driving
Tuo Feng
Wenguan Wang
Yue Yang
VGen
124
7
0
20 Jan 2025
DriveLM: Driving with Graph Visual Question Answering
Chonghao Sima
Katrin Renz
Kashyap Chitta
Lawrence Yunliang Chen
Hanxue Zhang
Chengen Xie
Jens Beißwenger
Ping Luo
Andreas Geiger
Hongyang Li
153
187
0
17 Jan 2025
UniGaussian: Driving Scene Reconstruction from Multiple Camera Models via Unified Gaussian Representations
Y. Ren
Guile Wu
Runhao Li
Zheyuan Yang
Yibo Liu
Xingxin Chen
Tongtong Cao
Bingbing Liu
3DGS
105
4
0
22 Nov 2024
LiDAR-LLM: Exploring the Potential of Large Language Models for 3D LiDAR Understanding
Senqiao Yang
Jiaming Liu
Ray Zhang
Mingjie Pan
Zoey Guo
Xiaoqi Li
Zehui Chen
Peng Gao
Yandong Guo
Shanghang Zhang
3DV
62
65
0
21 Dec 2023
All for One, and One for All: UrbanSyn Dataset, the third Musketeer of Synthetic Driving Scenes
J. L. Gómez
Manuel Silva
Antonio Seoane
Agnes Borrás
Mario Noriega
Germán Ros
Jose A. Iglesias-Guitian
Antonio M. López
3DPC
140
12
0
19 Dec 2023
DrivingGaussian: Composite Gaussian Splatting for Surrounding Dynamic Autonomous Driving Scenes
Xiaoyu Zhou
Zhiwei Lin
Xiaojun Shan
Yongtao Wang
Deqing Sun
Ming-Hsuan Yang
3DGS
95
190
0
13 Dec 2023
OccWorld: Learning a 3D Occupancy World Model for Autonomous Driving
Wenzhao Zheng
Weiliang Chen
Yuanhui Huang
Borui Zhang
Yueqi Duan
Jiwen Lu
VGen
86
77
0
27 Nov 2023
SelfOcc: Self-Supervised Vision-Based 3D Occupancy Prediction
Yuanhui Huang
Wenzhao Zheng
Borui Zhang
Jie Zhou
Jiwen Lu
3DPC
86
71
0
21 Nov 2023
On the Road with GPT-4V(ision): Early Explorations of Visual-Language Model on Autonomous Driving
Licheng Wen
Xuemeng Yang
Daocheng Fu
Xiaofeng Wang
Pinlong Cai
...
Xinyu Cai
Min Dou
Shuanglu Hu
Botian Shi
Yu Qiao
VLM
60
83
0
09 Nov 2023
LLM4Drive: A Survey of Large Language Models for Autonomous Driving
Zhenjie Yang
Xiaosong Jia
Hongyang Li
Junchi Yan
ELM
71
106
0
02 Nov 2023
PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm
Haoyi Zhu
Honghui Yang
Xiaoyang Wu
Di Huang
Sha Zhang
...
Hengshuang Zhao
Chunhua Shen
Yu Qiao
Tong He
Wanli Ouyang
SSL
100
44
0
12 Oct 2023
DrivingDiffusion: Layout-Guided multi-view driving scene video generation with latent diffusion model
Xiaofan Li
Yifu Zhang
Xiaoqing Ye
VGen
88
75
0
11 Oct 2023
LanguageMPC: Large Language Models as Decision Makers for Autonomous Driving
Hao Sha
Yao Mu
Yuxuan Jiang
Li Chen
Chenfeng Xu
Ping Luo
Shengbo Eben Li
Masayoshi Tomizuka
Wei Zhan
Mingyu Ding
195
170
0
04 Oct 2023
GAIA-1: A Generative World Model for Autonomous Driving
Masane Fuchi
Lloyd Russell
Hudson Yeo
Zak Murez
Hiroto Minami
Alex Kendall
Tomohiro Takagi
Gianluca Corrado
VGen
72
230
0
29 Sep 2023
DiLu: A Knowledge-Driven Approach to Autonomous Driving with Large Language Models
Licheng Wen
Daocheng Fu
Xin Li
Xinyu Cai
Tengyu Ma
Pinlong Cai
Min Dou
Botian Shi
Liang He
Yu Qiao
52
151
0
28 Sep 2023
Few-Shot Panoptic Segmentation With Foundation Models
Markus Kappeler
Kürsat Petek
Niclas Vodisch
Wolfram Burgard
Abhinav Valada
44
17
0
19 Sep 2023
RenderOcc: Vision-Centric 3D Occupancy Prediction with 2D Rendering Supervision
Mingjie Pan
Jiaming Liu
Renrui Zhang
Peixiang Huang
Xiaoqi Li
Bing Wang
Hongwei Xie
Li Liu
Shanghang Zhang
74
84
0
18 Sep 2023
MA-SAM: Modality-agnostic SAM Adaptation for 3D Medical Image Segmentation
Cheng Chen
Juzheng Miao
Dufan Wu
Zhiling Yan
Sekeun Kim
...
Lichao Sun
Xiang Li
Tianming Liu
Pheng-Ann Heng
Quanzheng Li
MedIm
81
60
0
16 Sep 2023
Language Prompt for Autonomous Driving
Dongming Wu
Wencheng Han
Tiancai Wang
Yingfei Liu
Cheng-zhong Xu
Jianbing Shen
VLM
77
81
0
08 Sep 2023
DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models
Wei Wu
Yuzhong Zhao
Hao Chen
Yuchao Gu
Rui Zhao
Yefei He
Hong Zhou
Mike Zheng Shou
Chunhua Shen
66
100
0
11 Aug 2023
Convolutions Die Hard: Open-Vocabulary Segmentation with Single Frozen Convolutional CLIP
Qihang Yu
Ju He
XueQing Deng
Xiaohui Shen
Liang-Chieh Chen
VLM
CLIP
64
144
0
04 Aug 2023
FB-OCC: 3D Occupancy Prediction based on Forward-Backward View Transformation
Zhiqi Li
Zhiding Yu
David Austin
Mingsheng Fang
Shiyi Lan
Jan Kautz
J. Álvarez
67
105
0
04 Jul 2023
StreetSurf: Extending Multi-view Implicit Surface Reconstruction to Street Views
Jianfei Guo
Nianchen Deng
Xinyang Li
Yeqi Bai
Botian Shi
Chiyu Wang
Chenjing Ding
Dongliang Wang
Yikang Li
58
89
0
08 Jun 2023
AD-PT: Autonomous Driving Pre-Training with Large-scale Point Cloud Dataset
Jiakang Yuan
Bo Zhang
Xiangchao Yan
Tao Chen
Botian Shi
Yikang Li
Yu Qiao
3DPC
44
26
0
01 Jun 2023
LiDAR2Map: In Defense of LiDAR-Based Semantic Map Construction Using Online Camera Distillation
Song Wang
Wentong Li
Wenyu Liu
Xiaolu Liu
Jianke Zhu
59
18
0
22 Apr 2023
SurroundOcc: Multi-Camera 3D Occupancy Prediction for Autonomous Driving
Yi Wei
Linqing Zhao
Wenzhao Zheng
Zhengbiao Zhu
Jie Zhou
Jiwen Lu
3DPC
57
219
0
16 Mar 2023
Rethinking Range View Representation for LiDAR Segmentation
Lingdong Kong
You-Chen Liu
Runnan Chen
Yuexin Ma
Xinge Zhu
Yikang Li
Yuenan Hou
Yu Qiao
Ziwei Liu
3DPC
63
118
0
09 Mar 2023
OccDepth: A Depth-Aware Method for 3D Semantic Scene Completion
Ruihang Miao
Weizhou Liu
Ming-lei Chen
Zheng Gong
Weixin Xu
Chen Hu
Shuchang Zhou
77
82
0
27 Feb 2023
Fast-BEV: A Fast and Strong Bird's-Eye View Perception Baseline
Yangguang Li
Bin Huang
Zeren Chen
Yufeng Cui
Feng Liang
...
Fenggang Liu
Enze Xie
Lu Sheng
Wanli Ouyang
Jing Shao
73
43
0
29 Jan 2023
Self-Supervised Image-to-Point Distillation via Semantically Tolerant Contrastive Loss
Anas Mahmoud
Jordan S. K. Hu
Tianshu Kuai
Ali Harakeh
Liam Paull
Steven L. Waslander
3DPC
SSL
58
29
0
12 Jan 2023
Ponder: Point Cloud Pre-training via Neural Rendering
Di Huang
Sida Peng
Tong He
Honghui Yang
Xiaowei Zhou
Wanli Ouyang
SSL
3DPC
67
41
0
31 Dec 2022
DETR4D: Direct Multi-View 3D Object Detection with Sparse Attention
Zhipeng Luo
Changqing Zhou
Gongjie Zhang
Shijian Lu
ViT
3DPC
54
20
0
15 Dec 2022
BEV-MAE: Bird's Eye View Masked Autoencoders for Point Cloud Pre-training in Autonomous Driving Scenarios
Zhiwei Lin
Yongtao Wang
Shengxiang Qi
Nan Dong
Ming-Hsuan Yang
3DPC
38
14
0
12 Dec 2022
GD-MAE: Generative Decoder for MAE Pre-training on LiDAR Point Clouds
Honghui Yang
Tong He
Jiaheng Liu
Huaguan Chen
Boxi Wu
Binbin Lin
Xiaofei He
Wanli Ouyang
81
61
0
06 Dec 2022
Towards All-in-one Pre-training via Maximizing Multi-modal Mutual Information
Weijie Su
Xizhou Zhu
Chenxin Tao
Lewei Lu
Bin Li
Gao Huang
Yu Qiao
Xiaogang Wang
Jie Zhou
Jifeng Dai
59
41
0
17 Nov 2022
DRAMA: Joint Risk Localization and Captioning in Driving
Srikanth Malla
Chiho Choi
Isht Dwivedi
Joonhyang Choi
Jiachen Li
129
91
0
22 Sep 2022
Multi-Object Tracking and Segmentation via Neural Message Passing
Guillem Brasó
Orcun Cetintas
Laura Leal-Taixe
VOT
53
22
0
15 Jul 2022
BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation
Zhijian Liu
Haotian Tang
Alexander Amini
Xinyu Yang
Huizi Mao
Daniela Rus
Song Han
135
892
0
26 May 2022
Flexible Diffusion Modeling of Long Videos
William Harvey
Saeid Naderiparizi
Vaden Masrani
Christian D. Weilbach
Frank Wood
DiffM
BDL
VGen
190
293
0
23 May 2022
READ: Large-Scale Neural Scene Rendering for Autonomous Driving
Zhuopeng Li
Lu Li
Zeyu Ma
Ping-jun Zhang
Junbo Chen
Jian-Zong Zhu
48
66
0
11 May 2022
Video Diffusion Models
Jonathan Ho
Tim Salimans
Alexey A. Gritsenko
William Chan
Mohammad Norouzi
David J. Fleet
DiffM
VGen
140
1,563
0
07 Apr 2022
BEVFormer: Learning Bird's-Eye-View Representation from Multi-Camera Images via Spatiotemporal Transformers
Zhiqi Li
Wenhai Wang
Hongyang Li
Enze Xie
Chonghao Sima
Tong Lu
Qiao Yu
Jifeng Dai
98
1,269
0
31 Mar 2022
Image-to-Lidar Self-Supervised Distillation for Autonomous Driving Data
Corentin Sautier
Gilles Puy
Spyros Gidaris
Alexandre Boulch
Andrei Bursuc
Renaud Marlet
3DPC
70
118
0
30 Mar 2022
Masked Autoencoders for Point Cloud Self-supervised Learning
Yatian Pang
Wenxiao Wang
Francis E. H. Tay
Wen Liu
Yonghong Tian
Liuliang Yuan
3DPC
ViT
55
461
0
13 Mar 2022
PETR: Position Embedding Transformation for Multi-View 3D Object Detection
Yingfei Liu
Tiancai Wang
Xinming Zhang
Jian Sun
3DPC
88
532
0
10 Mar 2022
Generative Adversarial Networks
Gilad Cohen
Raja Giryes
GAN
143
30,069
0
01 Mar 2022
SimIPU: Simple 2D Image and 3D Point Cloud Unsupervised Pre-Training for Spatial-Aware Visual Representations
Zhenyu Li
Zehui Chen
Ang Li
Liangji Fang
Qinhong Jiang
Xianming Liu
Junjun Jiang
Bolei Zhou
Hang Zhao
3DPC
SSL
53
66
0
09 Dec 2021
MonoScene: Monocular 3D Semantic Scene Completion
Anh-Quan Cao
Raoul de Charette
3DV
56
270
0
01 Dec 2021
FILIP: Fine-grained Interactive Language-Image Pre-Training
Lewei Yao
Runhu Huang
Lu Hou
Guansong Lu
Minzhe Niu
Hang Xu
Xiaodan Liang
Zhenguo Li
Xin Jiang
Chunjing Xu
VLM
CLIP
72
627
0
09 Nov 2021
CLIP-Adapter: Better Vision-Language Models with Feature Adapters
Peng Gao
Shijie Geng
Renrui Zhang
Teli Ma
Rongyao Fang
Yongfeng Zhang
Hongsheng Li
Yu Qiao
VLM
CLIP
199
1,011
0
09 Oct 2021