Learning to Truncate Ranked Lists for Information Retrieval

Chen Wu; Ruqing Zhang; Jiafeng Guo; Yixing Fan; Yanyan Lan; Xueqi Cheng

doi:10.1609/aaai.v35i5.16572

Authors

Chen Wu, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences
Ruqing Zhang, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences
Jiafeng Guo, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences
Yixing Fan, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences
Yanyan Lan, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences
Xueqi Cheng, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences

DOI:

https://doi.org/10.1609/aaai.v35i5.16572

Keywords:

Web Search & Information Retrieval

Abstract

Ranked list truncation is critically important in professional information retrieval applications such as patent search and legal search. The goal is to dynamically determine how many documents to return according to a user-defined objective, balancing the overall utility of the results against user effort. Existing methods formulate the task as a sequential decision problem and optimize a pre-defined loss as a proxy objective, so they are limited to local decisions and do not optimize the target objective directly. In this work, we propose AttnCut, a truncation model based on global decisions that directly optimizes user-defined objectives for ranked list truncation. Specifically, we use the Transformer architecture to capture global dependencies within the ranked list when making the truncation decision, and employ reward augmented maximum likelihood (RAML) for direct optimization. We consider two types of user-defined objectives of practical use. One is a widely adopted metric such as F1, which acts as a balanced objective; the other is the best F1 under a minimal recall constraint, which represents a typical objective in professional search. Empirical results on the Robust04 and MQ2007 datasets demonstrate the effectiveness of our approach compared with state-of-the-art baselines.
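
The two objectives named in the abstract, and the kind of soft training target that RAML works with, can be illustrated with a small sketch. This is not the AttnCut implementation: the function names (`f1_at_cutoffs`, `best_cutoff`, `raml_targets`), the binary relevance labels, and the temperature value are assumptions made for illustration. The target distribution follows the general RAML recipe of weighting each candidate output by its exponentiated reward; how the paper defines its rewards and combines them with the Transformer's output is not stated in the abstract.

```python
import math
from typing import List, Optional, Sequence


def f1_at_cutoffs(relevance: Sequence[int], total_relevant: int) -> List[float]:
    """F1 obtained by truncating the ranked list after each position k = 1..n.

    `relevance` holds binary labels (1 = relevant) in ranked order;
    `total_relevant` is the number of relevant documents for the query.
    """
    scores = []
    hits = 0
    for k, rel in enumerate(relevance, start=1):
        hits += rel
        precision = hits / k
        recall = hits / total_relevant if total_relevant else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return scores


def best_cutoff(relevance: Sequence[int], total_relevant: int,
                min_recall: float = 0.0) -> Optional[int]:
    """Oracle cut position: best F1 among cutoffs whose recall >= min_recall.

    Returns a 1-based position, or None if no cutoff meets the constraint.
    Ties are resolved in favor of the shorter list.
    """
    scores = f1_at_cutoffs(relevance, total_relevant)
    best_k, best_f1 = None, -1.0
    hits = 0
    for k, (rel, f1) in enumerate(zip(relevance, scores), start=1):
        hits += rel
        recall = hits / total_relevant if total_relevant else 0.0
        if recall >= min_recall and f1 > best_f1:
            best_k, best_f1 = k, f1
    return best_k


def raml_targets(rewards: Sequence[float], temperature: float = 0.5) -> List[float]:
    """Soft target over cut positions, proportional to exp(reward / temperature).

    The temperature is an assumed hyperparameter; lower values concentrate
    the mass on the highest-reward cutoff.
    """
    top = max(rewards)  # subtract the max for numerical stability
    weights = [math.exp((r - top) / temperature) for r in rewards]
    z = sum(weights)
    return [w / z for w in weights]


if __name__ == "__main__":
    labels = [1, 0, 1, 0, 0]          # hypothetical ranked list, 2 relevant docs
    rewards = f1_at_cutoffs(labels, total_relevant=2)
    print(best_cutoff(labels, total_relevant=2))
    print(best_cutoff([1, 0, 0, 1], total_relevant=2, min_recall=1.0))
    print(raml_targets(rewards))
```

A model trained against `raml_targets` sees every cut position weighted by how good it is under the chosen metric, rather than a single hard label at the oracle position, which is one way to connect the training loss to the user-defined objective.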
