1.
Zhang C, D’Haro LF, Chen Y, Zhang M, Li H. A Comprehensive Analysis of the Effectiveness of Large Language Models as Automatic Dialogue Evaluators. AAAI [Internet]. 2024 Mar 24 [cited 2024 Jul 16];38(17):19515-24. Available from: https://ojs.aaai.org/index.php/AAAI/article/view/29923