Zhao, Jinghan, Yifei Huang, and Feng Lu. “Learning Procedural-Aware Video Representations Through State-Grounded Hierarchy Unfolding.” Proceedings of the AAAI Conference on Artificial Intelligence 40, no. 16 (March 14, 2026): 13172–13180. Accessed May 1, 2026. https://ojs.aaai.org/index.php/AAAI/article/view/38318.