Drive-R1: Bridging Reasoning and Planning in VLMs for Autonomous Driving with Reinforcement Learning

Authors

  • Yue Li, University of Science and Technology of China
  • Meng Tian, Yinwang Intelligent Technology Co. Ltd.
  • Dechang Zhu, Yinwang Intelligent Technology Co. Ltd.
  • Jiangtong Zhu, Yinwang Intelligent Technology Co. Ltd.
  • Zhenyu Lin, Huawei Technologies Ltd.
  • Zhiwei Xiong, University of Science and Technology of China
  • Xinhai Zhao, Huawei Technologies Ltd.

DOI

https://doi.org/10.1609/aaai.v40i8.37602

Abstract

Large vision-language models (VLMs) for autonomous driving (AD) are evolving beyond perception and cognition tasks toward motion planning. However, we identify two critical challenges in this direction: (1) VLMs tend to learn shortcuts by relying heavily on historical input information, achieving seemingly strong planning results without genuinely understanding the visual inputs; and (2) chain-of-thought (CoT) reasoning processes are always misaligned with the motion planning outcomes, and how to effectively leverage complex reasoning capability to enhance planning remains largely underexplored. In this paper, we start from a small-scale, domain-specific VLM and propose Drive-R1, designed to bridge scenario reasoning and motion planning for AD. Drive-R1 first undergoes supervised fine-tuning on an elaborate dataset containing both long and short CoT data. In this stage, Drive-R1 is encouraged to reason step by step from visual input to final planning decisions. Subsequently, Drive-R1 is trained within a reinforcement learning framework that incentivizes the discovery of reasoning paths that are more informative for planning, guided by rewards based on predicted trajectories and meta actions. Experimental evaluations on the nuScenes and DriveLM-nuScenes benchmarks demonstrate that Drive-R1 achieves superior performance compared to existing state-of-the-art VLMs. We believe that Drive-R1 presents a promising direction for bridging reasoning and planning in AD, offering methodological insights for future research and applications.

Published

2026-03-14

How to Cite

Li, Y., Tian, M., Zhu, D., Zhu, J., Lin, Z., Xiong, Z., & Zhao, X. (2026). Drive-R1: Bridging Reasoning and Planning in VLMs for Autonomous Driving with Reinforcement Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 40(8), 6708–6716. https://doi.org/10.1609/aaai.v40i8.37602

Section

AAAI Technical Track on Computer Vision V