Improved Text Matching by Enhancing Mutual Information
Keywords: LambdaRank, Question Rewrite
Text matching is a core problem in question answering (QA), information retrieval (IR), and many other fields. We propose to reformulate the original text, i.e., to generate a new text that is semantically equivalent to the original, in order to improve the degree of matching. Intuitively, the generated text increases the mutual information between the two text sequences. We employ a generative adversarial network as the reformulation model, in which a discriminator guides the text generation process. In this work, we focus on matching questions and answers, where the task is to rank answers by their degree of match with the question. We first reformulate the original question without changing the asker's intent, then compute a relevance score for each answer. To evaluate the method, we collected questions and answers from Zhihu. In addition, we conduct extensive experiments on public datasets such as SemEval and WikiQA to compare our method with existing methods. Experimental results demonstrate that adding the reformulated question consistently improves ranking performance across different matching models, indicating that the reformulated question enhances mutual information and effectively bridges the semantic gap between questions and answers.
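The ranking step described above can be illustrated with a minimal Python sketch. It is not the method of this work: the token-overlap scorer `overlap_score` stands in for a learned matching model, the reformulated question is written by hand rather than produced by a generative adversarial network, and the weighted sum controlled by the hypothetical `weight` parameter is only one assumed way of combining the two scores.

```python
from typing import Callable, List


def overlap_score(question: str, answer: str) -> float:
    """Toy matcher (assumption): Jaccard overlap between lowercase token sets."""
    q = set(question.lower().split())
    a = set(answer.lower().split())
    if not q or not a:
        return 0.0
    return len(q & a) / len(q | a)


def rank_answers(
    question: str,
    reformulated: str,
    answers: List[str],
    score: Callable[[str, str], float] = overlap_score,
    weight: float = 0.5,
) -> List[str]:
    """Rank answers by a weighted sum of two relevance scores.

    The weighted sum is an illustrative assumption; `weight` is the share
    given to the score computed against the reformulated question.
    """
    scored = [
        ((1.0 - weight) * score(question, a) + weight * score(reformulated, a), a)
        for a in answers
    ]
    # Sort by combined score, highest first; ties keep their input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [a for _, a in scored]


# Hand-written example data (hypothetical, not taken from any dataset).
question = "how to fix a flat tire"
reformulated = "how do i repair a punctured bicycle tyre"
answers = [
    "the weather is nice today",
    "patch the punctured tube and repair the tyre",
]
ranked = rank_answers(question, reformulated, answers)
print(ranked[0])
```

In this toy case the original question shares no tokens with either answer, so the original score alone cannot separate them; the reformulated question supplies the overlapping vocabulary that moves the relevant answer to the top.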