1. Huang C, Tang L, Zhan Z, Yu L, Zeng R, Liu Z, et al. UNeMo: Collaborative Visual-Language Reasoning and Navigation via a Multimodal World Model. AAAI [Internet]. 2026 Mar. 14 [cited 2026 May 11];40(22):18315-23. Available from: https://ojs.aaai.org/index.php/AAAI/article/view/38895