[1] S. Zhang, “Multi-Action Dialog Policy Learning from Logged User Feedback”, AAAI, vol. 37, no. 11, pp. 13976–13983, Jun. 2023.