1. Eriksson M, Purificato E, Noroozian A, Vinagre J, Chaslot G, Gomez E, et al. Can We Trust AI Benchmarks? An Interdisciplinary Review of Current Issues in AI Evaluation. AIES [Internet]. 2025 Oct 15 [cited 2026 May 29];8(1):850-64. Available from: https://ojs.aaai.org/index.php/AIES/article/view/36595