[1] Eriksson, M. et al. 2025. Can We Trust AI Benchmarks? An Interdisciplinary Review of Current Issues in AI Evaluation. Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society. 8, 1 (Oct. 2025), 850–864. DOI:https://doi.org/10.1609/aies.v8i1.36595.