Xu, Xinrun, Pi Bu, Ye Wang, Börje F. Karlsson, Ziming Wang, Tengtao Song, Qi Zhu, Jun Song, Zhiming Ding, and Bo Zheng. 2026. “DeepPhy: Benchmarking Agentic VLMs on Physical Reasoning.” Proceedings of the AAAI Conference on Artificial Intelligence 40 (40): 34160-68. https://doi.org/10.1609/aaai.v40i40.40711.