Huang, C., Tang, L., Zhan, Z., Yu, L., Zeng, R., Liu, Z., … Li, J. (2026). UNeMo: Collaborative Visual-Language Reasoning and Navigation via a Multimodal World Model. Proceedings of the AAAI Conference on Artificial Intelligence, 40(22), 18315–18323. https://doi.org/10.1609/aaai.v40i22.38895