Tu, Yunbin, Liang Li, Li Su, and Qingming Huang. 2025. “Query-Centric Audio-Visual Cognition Network for Moment Retrieval, Segmentation and Step-Captioning”. Proceedings of the AAAI Conference on Artificial Intelligence 39 (7):7464-72. https://doi.org/10.1609/aaai.v39i7.32803.