LING, Run et al. MoFu: Scale-Aware Modulation and Fourier Fusion for Multi-Subject Video Generation. Proceedings of the AAAI Conference on Artificial Intelligence, [S. l.], v. 40, n. 9, p. 7033–7041, 2026. DOI: 10.1609/aaai.v40i9.37638. Available at: https://ojs.aaai.org/index.php/AAAI/article/view/37638. Accessed: 25 May 2026.