[1] G. Li, B. Zhao, J. Yang, and L. Sevilla-Lara, “Mask2IV: Interaction-Centric Video Generation via Mask Trajectories,” AAAI, vol. 40, no. 8, pp. 6091–6099, Mar. 2026.