Gao, F., Li, D., Xia, D., Mi, F., Wang, Y., Shang, L., & Wang, B. (2026). EssayBench: Evaluating Large Language Models in Multi-Genre Chinese Essay Writing. Proceedings of the AAAI Conference on Artificial Intelligence, 40(44), 37396–37406. https://doi.org/10.1609/aaai.v40i44.41072