End-to-End Autonomous Driving Through V2X Cooperation

Authors

  • Haibao Yu (The University of Hong Kong; AIR, Tsinghua University)
  • Wenxian Yang (AIR, Tsinghua University)
  • Jiaru Zhong (AIR, Tsinghua University; Beijing Institute of Technology)
  • Zhenwei Yang (AIR, Tsinghua University; University of Science and Technology Beijing)
  • Siqi Fan (AIR, Tsinghua University)
  • Ping Luo (The University of Hong Kong)
  • Zaiqing Nie (AIR, Tsinghua University)

DOI:

https://doi.org/10.1609/aaai.v39i9.33040

Abstract

Cooperatively utilizing both ego-vehicle and infrastructure sensor data via V2X communication has emerged as a promising approach for advanced autonomous driving. However, current research mainly focuses on improving individual modules rather than on end-to-end learning that optimizes final planning performance, leaving the potential of the data underutilized. In this paper, we introduce UniV2X, a pioneering cooperative autonomous driving framework that seamlessly integrates all key driving modules across diverse views into a unified network. We propose a sparse-dense hybrid data transmission and fusion mechanism for effective vehicle-infrastructure cooperation, offering three advantages: 1) Effective for simultaneously enhancing agent perception, online mapping, and occupancy prediction, ultimately improving planning performance. 2) Transmission-friendly for practical and limited communication conditions. 3) Reliable data fusion with interpretability of this hybrid data. We implement UniV2X and reproduce several benchmark methods on the challenging DAIR-V2X, a real-world cooperative driving dataset. Experimental results show that UniV2X significantly improves planning performance as well as the performance of all intermediate outputs.

Published

2025-04-11

How to Cite

Yu, H., Yang, W., Zhong, J., Yang, Z., Fan, S., Luo, P., & Nie, Z. (2025). End-to-End Autonomous Driving Through V2X Cooperation. Proceedings of the AAAI Conference on Artificial Intelligence, 39(9), 9598–9606. https://doi.org/10.1609/aaai.v39i9.33040

Section

AAAI Technical Track on Computer Vision VIII