(1) Li, Z.; Zhou, J.; Zhang, J.; Tang, S.; Li, K.; Guo, D. Patch-Level Sounding Object Tracking for Audio-Visual Question Answering. AAAI 2025, 39, 5075-5083.