Li, G., Zhao, B., Yang, J., & Sevilla-Lara, L. (2026). Mask2IV: Interaction-Centric Video Generation via Mask Trajectories. Proceedings of the AAAI Conference on Artificial Intelligence, 40(8), 6091–6099. https://doi.org/10.1609/aaai.v40i8.37533