Iizuka, Shinya, Shota Mochizuki, Atsumoto Ohashi, Sanae Yamashita, Ao Guo, and Ryuichiro Higashinaka. “Clarifying the Dialogue-Level Performance of GPT-3.5 and GPT-4 in Task-Oriented and Non-Task-Oriented Dialogue Systems.” Proceedings of the AAAI Symposium Series 2, no. 1 (January 22, 2024): 182–186. Accessed May 9, 2026. https://ojs.aaai.org/index.php/AAAI-SS/article/view/27668.