Li, Xiang, Yunshi Lan, and Chao Yang. 2025. “TreeEval: Benchmark-Free Evaluation of Large Language Models through Tree Planning.” Proceedings of the AAAI Conference on Artificial Intelligence 39 (23): 24485–93. https://doi.org/10.1609/aaai.v39i23.34627.