Song, Wenxuan, et al. “ReconVLA: Reconstructive Vision-Language-Action Model As Effective Robot Perceiver.” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 40, no. 22, Mar. 2026, pp. 18549-57, doi:10.1609/aaai.v40i22.38921.