TY - JOUR
AU - Li, Xin
AU - Li, Piji
AU - Bi, Wei
AU - Liu, Xiaojiang
AU - Lam, Wai
PY - 2020/04/03
Y2 - 2024/03/28
TI - Relevance-Promoting Language Model for Short-Text Conversation
JF - Proceedings of the AAAI Conference on Artificial Intelligence
JA - AAAI
VL - 34
IS - 05
SE - AAAI Technical Track: Natural Language Processing
DO - 10.1609/aaai.v34i05.6340
UR - https://ojs.aaai.org/index.php/AAAI/article/view/6340
SP - 8253
EP - 8260
AB - Despite the effectiveness of the sequence-to-sequence framework on the task of Short-Text Conversation (STC), the issue of under-exploitation of training data (i.e., the supervision signals from query text are ignored) remains unresolved. Moreover, the adopted maximization-based decoding strategies, which tend to generate generic or repetitive responses, are unsuited to the STC task. In this paper, we propose to formulate the STC task as a language modeling problem and tailor-make a training strategy to adapt a language model for response generation. To enhance generation performance, we design a relevance-promoting transformer language model, which performs additional supervised source attention after the self-attention to increase the importance of informative query tokens in calculating the token-level representation. The model further refines the query representation with relevance clues inferred from its multiple references during training. In testing, we adopt a randomization-over-maximization strategy to reduce the generation of generic responses. Experimental results on a large Chinese STC dataset demonstrate the superiority of the proposed model on relevance and diversity metrics.
ER - 