A Disentangled-Attention Based Framework with Persona-Aware Prompt Learning for Dialogue Generation
DOI:
https://doi.org/10.1609/aaai.v37i11.26556
Keywords:
SNLP: Generation
Abstract
Endowing dialogue agents with personas is the key to delivering more human-like conversations. However, existing persona-grounded dialogue systems still lack informative details of human conversations and tend to reply with inconsistent and generic responses. One of the main underlying causes is that pre-defined persona sentences are generally short and merely superficial descriptions of personal attributes, making appropriate persona selection and understanding non-trivial. Another challenge is that it is crucial to consider the context and the conversation flow to dynamically determine when to invoke different types of persona signals. To address these problems, we propose a disentangled-attention based pre-training architecture, which incorporates persona-aware prompt learning to bridge the connection between the selected persona and response generation. Our model first exploits the conversation flow to select context-relevant personas, and subsequently enriches the superficial persona descriptions with extra personality traits through persona-aware prompting. Finally, the decoder leverages a disentangled-attention mechanism to flexibly control the reliance on personas and dialogue contexts, and incorporates A*-like keyword-based heuristic estimates for controllable generation. Extensive experiments show that our approach can outperform strong baselines and deliver more consistent and engaging responses on the PERSONA-CHAT dataset.
Published
2023-06-26
How to Cite
Liu, P., Huang, Z., Zhang, X., Wang, L., de Melo, G., Lin, X., Pang, L., & He, L. (2023). A Disentangled-Attention Based Framework with Persona-Aware Prompt Learning for Dialogue Generation. Proceedings of the AAAI Conference on Artificial Intelligence, 37(11), 13255-13263. https://doi.org/10.1609/aaai.v37i11.26556
Section
AAAI Technical Track on Speech & Natural Language Processing