[1] T. Zhang, H. Duan, H. Hao, Y. Qiao, J. Dai, and Z. Hou, “Grounding Actions in Camera Space: Observation-Centric Vision-Language-Action Policy”, AAAI, vol. 40, no. 22, pp. 18782–18790, Mar. 2026.