Ruan, J. (2026) “MME-SCI: A Comprehensive and Challenging Science Benchmark for Multimodal Large Language Models”, Proceedings of the AAAI Conference on Artificial Intelligence, 40(11), pp. 8760–8768. doi: 10.1609/aaai.v40i11.37829.