SparseWorld: A Flexible, Adaptive, and Efficient 4D Occupancy World Model Powered by Sparse and Dynamic Queries

Authors

  • Chenxu Dang, Huazhong University of Science and Technology; Institute for AI Industry Research (AIR), Tsinghua University
  • Haiyan Liu, Lenovo Group Limited
  • Jason Bao, Lenovo Group Limited
  • Pei An, Huazhong University of Science and Technology
  • Xinyue Tang, Lenovo Group Limited
  • An Pan, AIR Wuxi Innovation Center, Tsinghua University (AIRIC)
  • Jie Ma, Huazhong University of Science and Technology
  • Bingchuan Sun, Lenovo Group Limited
  • Yan Wang, Institute for AI Industry Research (AIR), Tsinghua University

DOI:

https://doi.org/10.1609/aaai.v40i5.37347

Abstract

Semantic occupancy has emerged as a powerful representation in world models for its ability to capture rich spatial semantics. However, most existing occupancy world models rely on static, fixed embeddings or grids, which inherently limit the flexibility of perception. Moreover, their "in-place classification" over grids is potentially misaligned with the dynamic and continuous nature of real scenarios. In this paper, we propose SparseWorld, a novel 4D occupancy world model that is flexible, adaptive, and efficient, powered by sparse and dynamic queries. We propose a Range-Adaptive Perception module, in which learnable queries are modulated by the ego vehicle states and enriched with temporal-spatial associations to enable extended-range perception. To effectively capture the dynamics of the scene, we design a State-Conditioned Forecasting module, which replaces classification-based forecasting with a regression-guided formulation, precisely aligning the dynamic queries with the continuity of the 4D environment. In addition, we devise a Temporal-Aware Self-Scheduling training strategy to enable smooth and efficient training. Extensive experiments demonstrate that SparseWorld achieves state-of-the-art performance across perception, forecasting, and planning tasks. Comprehensive visualizations and ablation studies further validate the advantages of SparseWorld in terms of flexibility, adaptability, and efficiency.
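
The contrast the abstract draws between "in-place classification" over a fixed grid and a regression-guided formulation can be illustrated with a toy NumPy sketch. This is not the SparseWorld implementation: the helper names `voxelize` and `forecast_by_regression`, the grid size, the voxel size, and the constant per-step displacement are all assumptions made for illustration. The sketch only shows that points moved by continuous offsets retain sub-voxel motion, whereas a grid of class labels cannot represent that motion until a point crosses a cell boundary.

```python
import numpy as np


def voxelize(points, labels, grid_min, voxel_size, grid_shape, empty_label=0):
    """Scatter labelled 3D points into a dense semantic occupancy grid."""
    grid = np.full(grid_shape, empty_label, dtype=np.int64)
    idx = np.floor((points - grid_min) / voxel_size).astype(np.int64)
    # Keep only points that fall inside the fixed grid extent.
    inside = np.all((idx >= 0) & (idx < np.array(grid_shape)), axis=1)
    idx = idx[inside]
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = labels[inside]
    return grid


def forecast_by_regression(points, displacement):
    """Move each sparse point by a continuous (regressed) displacement."""
    return points + displacement


# Hypothetical scene: one moving point (class 1) and one static point (class 2).
points_t0 = np.array([[0.2, 0.2, 0.2], [2.6, 1.4, 0.2]])
labels = np.array([1, 2])
# Assumed per-step displacement; a real model would predict this per query.
displacement = np.array([[0.5, 0.0, 0.0], [0.0, 0.0, 0.0]])

grid_min = np.zeros(3)
voxel_size = 1.0
grid_shape = (4, 4, 1)

points_t1 = forecast_by_regression(points_t0, displacement)
points_t2 = forecast_by_regression(points_t1, displacement)

grid_t0 = voxelize(points_t0, labels, grid_min, voxel_size, grid_shape)
grid_t1 = voxelize(points_t1, labels, grid_min, voxel_size, grid_shape)
grid_t2 = voxelize(points_t2, labels, grid_min, voxel_size, grid_shape)

# After one step the moving point has shifted by half a voxel, so the grid
# is unchanged; only after the second step does it enter the next cell.
print(np.array_equal(grid_t0, grid_t1))
print(grid_t2[1, 0, 0], grid_t2[0, 0, 0])
```

In this toy setting the continuous point set tracks the half-voxel motion at every step, while the rasterized grid changes only at the second step.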

Published

2026-03-14

How to Cite

Dang, C., Liu, H., Bao, J., An, P., Tang, X., Pan, A., … Wang, Y. (2026). SparseWorld: A Flexible, Adaptive, and Efficient 4D Occupancy World Model Powered by Sparse and Dynamic Queries. Proceedings of the AAAI Conference on Artificial Intelligence, 40(5), 3497–3505. https://doi.org/10.1609/aaai.v40i5.37347

Issue

Vol. 40 No. 5

Section

AAAI Technical Track on Computer Vision II