Ye, Q., Zeng, W., Liu, M., Zhang, J., Hu, Y., Yu, Z., & Zhou, Y. (2026). When Eyes and Ears Disagree: Can MLLMs Discern Audio-Visual Confusion? Proceedings of the AAAI Conference on Artificial Intelligence, 40(14), 11955-11963. https://doi.org/10.1609/aaai.v40i14.38183