Ruan, J., Jiang, D., Gao, X., Liu, T., Fu, Y., & Kang, Y. (2026). MME-SCI: A Comprehensive and Challenging Science Benchmark for Multimodal Large Language Models. Proceedings of the AAAI Conference on Artificial Intelligence, 40(11), 8760–8768. https://doi.org/10.1609/aaai.v40i11.37829