SubjectDrive: Scaling Generative Data in Autonomous Driving via Subject Control

Authors

  • Binyuan Huang, Wuhan University
  • Yuqing Wen, University of Science and Technology of China
  • Yucheng Zhao, Megvii Technology Inc.
  • Yaosi Hu, Hong Kong Polytechnic University
  • Yingfei Liu, Megvii Technology Inc.
  • Fan Jia, Megvii Technology Inc.
  • Weixin Mao, Megvii Technology Inc.
  • Tiancai Wang, Megvii Technology Inc.
  • Chi Zhang, Mach Drive
  • Chang Wen Chen, Hong Kong Polytechnic University
  • Zhenzhong Chen, Wuhan University
  • Xiangyu Zhang, Megvii Technology Inc.

DOI:

https://doi.org/10.1609/aaai.v39i4.32376

Abstract

Autonomous driving progress relies on large-scale annotated datasets. In this work, we explore the potential of generative models to produce vast quantities of freely-labeled data for autonomous driving applications and present SubjectDrive, the first model proven to scale generative data production in a way that could continuously improve autonomous driving applications. We investigate the impact of scaling up the quantity of generative data on the performance of downstream perception models and find that enhancing data diversity plays a crucial role in effectively scaling generative data production. Therefore, we have developed a novel model equipped with a subject control mechanism, which allows the generative model to leverage diverse external data sources for producing varied and useful data. Extensive evaluations confirm SubjectDrive's efficacy in generating scalable autonomous driving training data, marking a significant step toward revolutionizing data production methods in this field.

Published

2025-04-11

How to Cite

Huang, B., Wen, Y., Zhao, Y., Hu, Y., Liu, Y., Jia, F., Mao, W., Wang, T., Zhang, C., Chen, C. W., Chen, Z., & Zhang, X. (2025). SubjectDrive: Scaling Generative Data in Autonomous Driving via Subject Control. Proceedings of the AAAI Conference on Artificial Intelligence, 39(4), 3617-3625. https://doi.org/10.1609/aaai.v39i4.32376

Section

AAAI Technical Track on Computer Vision III