A Unified Pretraining Framework for Passage Ranking and Expansion

Authors

  • Ming Yan, Alibaba Group
  • Chenliang Li, Alibaba Group
  • Bin Bi, Alibaba Group
  • Wei Wang, Alibaba Group
  • Songfang Huang, Alibaba Group

DOI:

https://doi.org/10.1609/aaai.v35i5.16584

Keywords:

Web Search & Information Retrieval

Abstract

Pretrained language models have recently advanced a wide range of natural language processing tasks, and their application to information retrieval (IR) tasks has also achieved impressive results. Typical methods either directly apply a pretrained model to improve the re-ranking stage, or use it to conduct passage expansion and term weighting for first-stage retrieval. We observe that the passage ranking and passage expansion tasks are inherently related and can benefit from each other. Therefore, in this paper, we propose a general pretraining framework to enhance both tasks with Unified Encoder-Decoder networks (UED). The overall ranking framework consists of two parts arranged in a cascade: (1) passage expansion with a pretraining-based query generation method; (2) re-ranking of passage candidates from a traditional retrieval method with a pretrained transformer encoder. Both parts are based on the same pretrained UED model, which we train jointly on the passage ranking and query generation tasks to further improve the full ranking pipeline. Extensive experiments on two large-scale passage retrieval datasets demonstrate state-of-the-art results for the proposed framework in both first-stage retrieval and final re-ranking. In addition, we have deployed the framework in our online production system, where it stably serves industrial applications at a request volume of up to 100 QPS, with responses in less than 300 ms.
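
The two-part cascade described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the term-overlap retriever (a simplified stand-in for a traditional retrieval method), and the two toy callables standing in for the query-generation decoder and the ranking encoder are all assumptions made for illustration.

```python
from collections import Counter
from typing import Callable, Dict, List

# Hypothetical interfaces for the two model-backed components.
QueryGenerator = Callable[[str], List[str]]   # passage -> generated queries
PairScorer = Callable[[str, str], float]      # (query, passage) -> relevance


def tokenize(text: str) -> List[str]:
    """Lowercase whitespace tokenizer; punctuation stays attached to words."""
    return text.lower().split()


def expand_passages(passages: Dict[str, str],
                    generate_queries: QueryGenerator) -> Dict[str, str]:
    """Part (1): append generated queries to each passage before indexing."""
    return {
        pid: " ".join([text, *generate_queries(text)])
        for pid, text in passages.items()
    }


def first_stage(query: str, index: Dict[str, str], k: int) -> List[str]:
    """Term-overlap retrieval, a simplified stand-in for a traditional retriever."""
    q_terms = Counter(tokenize(query))
    scored = []
    for pid, text in index.items():
        d_terms = Counter(tokenize(text))
        overlap = sum(min(n, d_terms[term]) for term, n in q_terms.items())
        if overlap > 0:
            scored.append((pid, overlap))
    # Highest overlap first; ties broken by passage id for determinism.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return [pid for pid, _ in scored[:k]]


def rerank(query: str, candidates: List[str], passages: Dict[str, str],
           score_pair: PairScorer) -> List[str]:
    """Part (2): reorder first-stage candidates by a query-passage score."""
    return sorted(candidates,
                  key=lambda pid: score_pair(query, passages[pid]),
                  reverse=True)


def toy_generate_queries(passage: str) -> List[str]:
    """Toy stand-in (assumption) for a learned query generator."""
    if "transformer" in passage.lower():
        return ["what is an attention model"]
    return []


def toy_score_pair(query: str, passage: str) -> float:
    """Toy stand-in (assumption) for a learned ranker: Jaccard token overlap."""
    q, p = set(tokenize(query)), set(tokenize(passage))
    return len(q & p) / len(q | p)


PASSAGES = {
    "p1": "The transformer architecture relies on self-attention layers.",
    "p2": "An attention span is how long a person can focus.",
    "p3": "Bread is baked in an oven.",
}

if __name__ == "__main__":
    query = "attention model"
    index = expand_passages(PASSAGES, toy_generate_queries)
    candidates = first_stage(query, index, k=2)
    print(candidates, rerank(query, candidates, PASSAGES, toy_score_pair))
```

In this toy corpus, passage `p1` shares no token with the query until the generated query is appended, which is the effect passage expansion is meant to have on first-stage recall. In the framework the abstract describes, the generator and the scorer would both be backed by the same pretrained model rather than by the toy functions used here.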

Published

2021-05-18

How to Cite

Yan, M., Li, C., Bi, B., Wang, W., & Huang, S. (2021). A Unified Pretraining Framework for Passage Ranking and Expansion. Proceedings of the AAAI Conference on Artificial Intelligence, 35(5), 4555-4563. https://doi.org/10.1609/aaai.v35i5.16584

Section

AAAI Technical Track on Data Mining and Knowledge Management