[1] R. Ling, “MoFu: Scale-Aware Modulation and Fourier Fusion for Multi-Subject Video Generation”, AAAI, vol. 40, no. 9, pp. 7033–7041, Mar. 2026.