[1] Ji, S. et al. 2026. Diff-V2M: A Hierarchical Conditional Diffusion Model with Explicit Rhythmic Modeling for Video-to-Music Generation. Proceedings of the AAAI Conference on Artificial Intelligence. 40, 26 (Mar. 2026), 22219–22227. DOI:https://doi.org/10.1609/aaai.v40i26.39378.