Personalizing a Dialogue System With Transfer Reinforcement Learning

Authors

  • Kaixiang Mo, Hong Kong University of Science and Technology
  • Yu Zhang, Hong Kong University of Science and Technology
  • Shuangyin Li, Hong Kong University of Science and Technology
  • Jiajun Li, Hong Kong University of Science and Technology
  • Qiang Yang, Hong Kong University of Science and Technology

DOI:

https://doi.org/10.1609/aaai.v32i1.11938

Keywords:

transfer learning, reinforcement learning, task-oriented dialogue system, personalized dialogue system, transfer reinforcement learning, multi-turn dialogue system

Abstract

It is difficult to train a personalized task-oriented dialogue system because the data collected from each individual is often insufficient. A personalized dialogue system trained on a small dataset is likely to overfit, making it difficult to adapt to different user needs. One way to solve this problem is to treat a collection of multiple users as a source domain and an individual user as a target domain, and to perform transfer learning from the source domain to the target domain. Following this idea, we propose a PErsonalized Task-oriented diALogue (PETAL) system, a transfer reinforcement learning framework based on POMDPs, to construct a personalized dialogue system. The PETAL system first learns common dialogue knowledge from the source domain and then adapts this knowledge to the target domain. The proposed PETAL system can avoid the negative transfer problem by modeling the differences between source and target users in a personalized Q-function. Experimental results on real-world coffee-shopping data and simulation data show that the proposed PETAL system can learn optimal policies for different users and thus effectively improve dialogue quality in the personalized setting.
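The core idea in the abstract, decomposing a personalized Q-function into common dialogue knowledge shared across users plus a per-user component adapted on the target domain, can be sketched as follows. This is a minimal illustrative sketch, not the paper's exact formulation: the linear feature parameterization, the class and method names, and the update rule are all assumptions made for the example.

```python
import numpy as np

class PersonalizedQ:
    """Hypothetical sketch: Q(user, s, a) = (w_common + w_user) . phi(s, a),
    where w_common holds knowledge transferred from the source users and
    w_user holds one small per-user adaptation for each target user."""

    def __init__(self, n_features, alpha=0.1, gamma=0.9):
        self.w_common = np.zeros(n_features)  # shared, learned on source domain
        self.w_user = {}                      # per-user deltas, learned on target
        self.alpha, self.gamma = alpha, gamma

    def q(self, user, phi):
        # Personalized Q-value: shared weights plus this user's delta.
        delta = self.w_user.get(user, np.zeros_like(self.w_common))
        return float((self.w_common + delta) @ phi)

    def update(self, user, phi, reward, phi_next, adapt_user_only=True):
        # One TD step (phi_next is the feature of the greedy next action,
        # a simplification). During target-domain adaptation only the
        # user-specific weights move, so common knowledge is preserved and
        # negative transfer from mismatched users is limited.
        td = reward + self.gamma * self.q(user, phi_next) - self.q(user, phi)
        grad = self.alpha * td * phi
        if adapt_user_only:
            self.w_user.setdefault(user, np.zeros_like(self.w_common))
            self.w_user[user] += grad
        else:
            self.w_common += grad
```

After a few adaptation steps, two users sharing `w_common` can still assign different values to the same state-action features, which is the mechanism the abstract credits for learning distinct optimal policies per user.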

Published

2018-04-27

How to Cite

Mo, K., Zhang, Y., Li, S., Li, J., & Yang, Q. (2018). Personalizing a Dialogue System With Transfer Reinforcement Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 32(1). https://doi.org/10.1609/aaai.v32i1.11938