(1) Zhao, Y.; Wang, Z.; Huang, Z. Automatic Curriculum Learning With Over-Repetition Penalty for Dialogue Policy Learning. AAAI 2021, 35, 14540-14548.