Li, X., Lan, Y. and Yang, C. (2025) “TreeEval: Benchmark-Free Evaluation of Large Language Models through Tree Planning”, Proceedings of the AAAI Conference on Artificial Intelligence, 39(23), pp. 24485–24493. doi: 10.1609/aaai.v39i23.34627.