Iizuka, Shinya, et al. “Clarifying the Dialogue-Level Performance of GPT-3.5 and GPT-4 in Task-Oriented and Non-Task-Oriented Dialogue Systems.” Proceedings of the AAAI Symposium Series, vol. 2, no. 1, Jan. 2024, pp. 182-86, doi:10.1609/aaaiss.v2i1.27668.