Response Enhanced Semi-supervised Dialogue Query Generation

Jianheng Huang; Ante Wang; Linfeng Gao; Linfeng Song; Jinsong Su

doi:10.1609/aaai.v38i16.29790

Authors

Jianheng Huang: School of Informatics, Xiamen University, China; Shanghai Artificial Intelligence Laboratory, China; Key Laboratory of Digital Protection and Intelligent Processing of Intangible Cultural Heritage of Fujian and Taiwan (Xiamen University), Ministry of Culture and Tourism, China
Ante Wang: School of Informatics, Xiamen University, China; Shanghai Artificial Intelligence Laboratory, China; Key Laboratory of Digital Protection and Intelligent Processing of Intangible Cultural Heritage of Fujian and Taiwan (Xiamen University), Ministry of Culture and Tourism, China
Linfeng Gao: School of Informatics, Xiamen University, China; Key Laboratory of Digital Protection and Intelligent Processing of Intangible Cultural Heritage of Fujian and Taiwan (Xiamen University), Ministry of Culture and Tourism, China
Linfeng Song: Tencent AI Lab
Jinsong Su: School of Informatics, Xiamen University, China; Shanghai Artificial Intelligence Laboratory, China; Key Laboratory of Digital Protection and Intelligent Processing of Intangible Cultural Heritage of Fujian and Taiwan (Xiamen University), Ministry of Culture and Tourism, China

DOI:

https://doi.org/10.1609/aaai.v38i16.29790

Keywords:

NLP: Conversational AI/Dialog Systems, NLP: Generation

Abstract

Leveraging vast and continually updated knowledge from the Internet is considered an important ability for a dialogue system. The dialogue query generation task was therefore proposed: it generates search queries from dialogue histories, which are then submitted to a search engine to retrieve relevant websites. Previous efforts in this area collected conversations with annotated queries and trained a query producer (QP) via standard supervised learning. However, these studies still face the challenges of data scarcity and domain adaptation. To address these issues, we propose SemiDQG, a semi-supervised learning framework that improves model performance with unlabeled conversations. Based on the observation that the search query is typically related to the topic of the dialogue response, we train a response-augmented query producer (RA) to provide rich and effective training signals for QP. We first apply a similarity-based query selection strategy to select high-quality RA-generated pseudo queries, which are used to construct pseudo instances for training QP and RA. We then adopt the REINFORCE algorithm to further enhance QP, with RA-provided rewards as fine-grained training signals. Experimental results and in-depth analysis on three benchmarks show the effectiveness of our framework in cross-domain and low-resource scenarios. In particular, SemiDQG significantly surpasses ChatGPT and competitive baselines. Our code is available at https://github.com/DeepLearnXMU/SemiDQG.
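
The two training signals named in the abstract, similarity-based selection of pseudo queries and a REINFORCE update driven by RA-provided rewards, can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say how similarity is measured or how rewards are computed, so the token-overlap F1 measure, the comparison against QP candidates, the `threshold` parameter, the mean-reward baseline, and all function names below are assumptions made for illustration. The released code at the URL above is the authoritative reference.

```python
from collections import Counter
from typing import List, Optional, Sequence


def token_f1(a: str, b: str) -> float:
    """Token-overlap F1 between two queries (an assumed similarity measure)."""
    ta, tb = a.lower().split(), b.lower().split()
    if not ta or not tb:
        return 0.0
    overlap = sum((Counter(ta) & Counter(tb)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(ta), overlap / len(tb)
    return 2 * precision * recall / (precision + recall)


def select_pseudo_query(
    ra_candidates: Sequence[str],
    qp_candidates: Sequence[str],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the paper
) -> Optional[str]:
    """Pick the RA candidate with the highest mean similarity to the QP
    candidates; return None when even the best one falls below threshold."""
    if not ra_candidates or not qp_candidates:
        return None

    def score(query: str) -> float:
        return sum(token_f1(query, c) for c in qp_candidates) / len(qp_candidates)

    best = max(ra_candidates, key=score)
    return best if score(best) >= threshold else None


def reinforce_loss(log_probs: List[float], rewards: List[float]) -> float:
    """REINFORCE loss with a mean-reward baseline (the baseline is assumed):
    loss = -mean((r_i - b) * log p_i), where log p_i is QP's log-probability
    of sampled query i and r_i is the reward RA assigns to that query."""
    if not rewards or len(log_probs) != len(rewards):
        raise ValueError("log_probs and rewards must be non-empty and aligned")
    baseline = sum(rewards) / len(rewards)
    weighted = sum((r - baseline) * lp for lp, r in zip(log_probs, rewards))
    return -weighted / len(rewards)
```

In an actual training loop the log-probabilities would come from QP as differentiable tensors, so that minimizing this loss raises the probability of queries RA rewards above the baseline; plain floats are used here only to keep the arithmetic visible.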
