[1]
L. Xie, “MAVERIX: Multimodal Audio-Visual Evaluation and Recognition IndeX”, AAAI, vol. 40, no. 32, pp. 27090–27098, Mar. 2026.