(1) Li, X.; Lan, Y.; Yang, C. TreeEval: Benchmark-Free Evaluation of Large Language Models through Tree Planning. AAAI 2025, 39, 24485-24493.