Zhang, Shuo, Junzhou Zhao, Pinghui Wang, Tianxiang Wang, Zi Liang, Jing Tao, Yi Huang, and Junlan Feng. “Multi-Action Dialog Policy Learning from Logged User Feedback.” Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 11 (June 26, 2023): 13976-13983. Accessed April 16, 2024. https://ojs.aaai.org/index.php/AAAI/article/view/26636.