Zhao, Pengcheng, Jinxing Zhou, Yang Zhao, Dan Guo, and Yanxiang Chen. “Multimodal Class-Aware Semantic Enhancement Network for Audio-Visual Video Parsing.” Proceedings of the AAAI Conference on Artificial Intelligence 39, no. 10 (April 11, 2025): 10448–10456. Accessed May 16, 2026. https://ojs.aaai.org/index.php/AAAI/article/view/33134.