[1]
Li, B. et al. 2025. Tri-Ergon: Fine-Grained Video-to-Audio Generation with Multi-Modal Conditions and LUFS Control. Proceedings of the AAAI Conference on Artificial Intelligence. 39, 5 (Apr. 2025), 4616–4624. DOI:https://doi.org/10.1609/aaai.v39i5.32487.