Li, H., Yang, S., Chen, Y., Chen, X., Yang, X., Tian, Y., … Pang, J. (2026). Towards Efficient and Robust Manipulation via Multi-Frame Vision-Language-Action Modeling. Proceedings of the AAAI Conference on Artificial Intelligence, 40(22), 18388–18396. https://doi.org/10.1609/aaai.v40i22.38903