1.
Zhang S, Zhao J, Wang P, Wang T, Liang Z, Tao J, Huang Y, Feng J. Multi-Action Dialog Policy Learning from Logged User Feedback. AAAI [Internet]. 2023 Jun 26 [cited 2024 Apr 15];37(11):13976-83. Available from: https://ojs.aaai.org/index.php/AAAI/article/view/26636