1. Li Y, Guerin F, Lin C. LatestEval: Addressing Data Contamination in Language Model Evaluation through Dynamic and Time-Sensitive Test Construction. AAAI [Internet]. 2024 Mar. 24 [cited 2026 May 27];38(17):18600-7. Available from: https://ojs.aaai.org/index.php/AAAI/article/view/29822