1. Lu Z, Geng T, Chen Y, Wang T, Lu P, Zheng F. R-AVST: Empowering Video-LLMs with Fine-Grained Spatio-Temporal Reasoning in Complex Audio-Visual Scenarios. AAAI [Internet]. 2026 Mar. 14 [cited 2026 May 13];40(9):7627-35. Available from: https://ojs.aaai.org/index.php/AAAI/article/view/37704