Prospective Role of Foundation Models in Advancing Autonomous Vehicles

Wu, Jianhua; Gao, Bingzhao; Gao, Jincheng; Yu, Jianhao; Chu, Hongqing; Yu, Qiankun; Gong, Xun; Chang, Yi; Tseng, H. Eric; Chen, Hong; Chen, Jie

arXiv:2405.02288v2 (cs)

[Submitted on 8 Dec 2023 (v1), last revised 17 May 2024 (this version, v2)]

Abstract: With the development of artificial intelligence and breakthroughs in deep learning, large-scale Foundation Models (FMs) such as GPT and Sora have achieved remarkable results in many fields, including natural language processing and computer vision. The application of FMs to autonomous driving holds considerable promise. For example, they can enhance scene understanding and reasoning. By pre-training on rich linguistic and visual data, FMs can understand and interpret the various elements of a driving scene and provide cognitive reasoning that yields linguistic and action instructions for driving decisions and planning. Furthermore, FMs can augment data based on their understanding of driving scenarios, generating feasible scenes for the rare, long-tail events that are unlikely to be encountered during routine driving and data collection. This augmentation can in turn improve the accuracy and reliability of autonomous driving systems. Another testament to the potential of FMs lies in World Models, exemplified by the DREAMER series, which demonstrate the ability to comprehend physical laws and dynamics. Learning from massive data under the self-supervised learning paradigm, World Models can generate unseen yet plausible driving environments, which helps improve the prediction of road users' behaviors and the offline training of driving strategies. In this paper, we synthesize the applications and future trends of FMs in autonomous driving. By utilizing the powerful capabilities of FMs, we strive to tackle the potential issues stemming from the long-tail distribution in autonomous driving, consequently advancing overall safety in this domain.

Comments:	45 pages, 8 figures
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
Cite as:	arXiv:2405.02288 [cs.CV]
	(or arXiv:2405.02288v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2405.02288

Submission history

From: Jianhua Wu
[v1] Fri, 8 Dec 2023 15:35:24 UTC (1,176 KB)
[v2] Fri, 17 May 2024 10:47:50 UTC (1,451 KB)
