Bhosale, S., Yang, H., Kanojia, D., Deng, J., & Zhu, X. (2025). Unsupervised Audio-Visual Segmentation with Modality Alignment. Proceedings of the AAAI Conference on Artificial Intelligence, 39(15), 15567–15575. https://doi.org/10.1609/aaai.v39i15.33709