UniMapGen: A Generative Framework for Large-Scale Map Construction from Multi-modal Data

Authors

  • Yujian Yuan (Amap, Alibaba Group; The Hong Kong University of Science and Technology)
  • Changjie Wu (Amap, Alibaba Group)
  • Xinyuan Chang (Amap, Alibaba Group)
  • Sijin Wang (Amap, Alibaba Group)
  • Hang Zhang (Amap, Alibaba Group)
  • Shiyi Liang (Amap, Alibaba Group; Xi'an Jiaotong University)
  • Shuang Zeng (Amap, Alibaba Group; Xi'an Jiaotong University)
  • Mu Xu (Amap, Alibaba Group)

DOI:

https://doi.org/10.1609/aaai.v40i15.38219

Abstract

Large-scale map construction is foundational for critical applications such as autonomous driving and navigation systems. Traditional approaches rely mainly on costly and inefficient specialized data-collection vehicles and labor-intensive annotation. Existing satellite-based methods have shown promise in improving the efficiency and coverage of map construction, but they have two major limitations: (1) inherent drawbacks of satellite data (e.g., occlusions and outdated imagery) and (2) inefficient vectorization in perception-based methods, which yields discontinuous, rough roads that require extensive post-processing. This paper presents UniMapGen, a novel generative framework for large-scale map construction with three key innovations: (1) representing lane lines as discrete sequences and establishing an iterative strategy that generates more complete and smoother map vectors than traditional perception-based methods; (2) a flexible architecture that supports multi-modal inputs, enabling dynamic selection among BEV, PV, and text prompts to overcome the drawbacks of satellite data; and (3) a state update strategy that maintains global continuity and consistency of the constructed large-scale map. UniMapGen achieves state-of-the-art performance on the OpenSatMap dataset. Furthermore, it can infer occluded roads and predict roads missing from the dataset annotations.
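
To make the idea of representing lane lines as discrete sequences more concrete, the following is a minimal sketch of one common way to turn a polyline into integer tokens: uniform quantization of vertex coordinates. It is an illustration only and is not taken from the paper. The function names, the square tile extent, and the bin count are all hypothetical, and the actual UniMapGen tokenization may differ.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def polyline_to_tokens(points: List[Point], extent: float, num_bins: int) -> List[int]:
    """Quantize (x, y) vertices into a flat sequence of integer tokens.

    Assumption (not from the paper): coordinates lie in a square tile
    [0, extent] x [0, extent] and each axis is split into num_bins cells.
    """
    tokens: List[int] = []
    for x, y in points:
        for value in (x, y):
            clamped = min(max(value, 0.0), extent)
            # The upper edge of the tile maps to the last bin.
            tokens.append(min(int(clamped / extent * num_bins), num_bins - 1))
    return tokens


def tokens_to_polyline(tokens: List[int], extent: float, num_bins: int) -> List[Point]:
    """Map tokens back to the centers of their quantization cells."""
    cell = extent / num_bins
    coords = [(t + 0.5) * cell for t in tokens]
    return list(zip(coords[0::2], coords[1::2]))


# Hypothetical lane line in a 100 m tile, quantized at 1 m resolution.
lane = [(0.0, 0.0), (50.0, 25.0), (100.0, 100.0)]
lane_tokens = polyline_to_tokens(lane, extent=100.0, num_bins=100)
lane_decoded = tokens_to_polyline(lane_tokens, extent=100.0, num_bins=100)
```

Under this scheme the round-trip error per coordinate is at most half a cell, so the bin count trades vocabulary size against geometric precision. A sequence model could then emit such tokens one at a time, which is the general setting in which an iterative generation strategy operates.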

Published

2026-03-14

How to Cite

Yuan, Y., Wu, C., Chang, X., Wang, S., Zhang, H., Liang, S., … Xu, M. (2026). UniMapGen: A Generative Framework for Large-Scale Map Construction from Multi-modal Data. Proceedings of the AAAI Conference on Artificial Intelligence, 40(15), 12277–12285. https://doi.org/10.1609/aaai.v40i15.38219

Issue

Section

AAAI Technical Track on Computer Vision XII