Learning to Truncate Ranked Lists for Information Retrieval

Authors

  • Chen Wu, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences
  • Ruqing Zhang, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences
  • Jiafeng Guo, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences
  • Yixing Fan, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences
  • Yanyan Lan, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences
  • Xueqi Cheng, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences

DOI:

https://doi.org/10.1609/aaai.v35i5.16572

Keywords:

Web Search & Information Retrieval

Abstract

Ranked list truncation is critically important in a variety of professional information retrieval applications, such as patent search and legal search. The goal is to dynamically determine the number of returned documents according to user-defined objectives, so as to balance the overall utility of the results against user effort. Existing methods formulate this task as a sequential decision problem and use a pre-defined loss as a proxy objective, and therefore suffer from the limitations of local decisions and indirect optimization. In this work, we propose a global-decision-based truncation model named AttnCut, which directly optimizes user-defined objectives for ranked list truncation. Specifically, we use the transformer architecture to capture global dependencies within the ranked list for the truncation decision, and employ reward augmented maximum likelihood (RAML) for direct optimization. We consider two types of user-defined objectives of practical use. One is a widely adopted metric such as F1, which acts as a balanced objective; the other is the best F1 under a minimal recall constraint, which represents a typical objective in professional search. Empirical results on the Robust04 and MQ2007 datasets demonstrate the effectiveness of our approach compared with state-of-the-art baselines.
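
To make the two objectives in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It computes F1 at every possible cut position of a ranked list, picks the oracle cut with or without a minimal recall constraint, and turns per-position rewards into an exponentiated-reward target distribution in the style of RAML. The function names (`f1_at_cutoffs`, `best_cutoff`, `raml_targets`), the binary relevance labels, and the temperature `tau` are all illustrative assumptions; how AttnCut actually builds and uses such targets is described in the paper itself.

```python
import math
from typing import List, Optional


def f1_at_cutoffs(labels: List[int]) -> List[float]:
    """F1 of the top-k prefix for k = 1..len(labels); labels are 1 (relevant) or 0."""
    total_relevant = sum(labels)
    scores, hits = [], 0
    for k, label in enumerate(labels, start=1):
        hits += label
        # With P = hits/k and R = hits/total_relevant, 2PR/(P+R) = 2*hits/(k+total_relevant).
        scores.append(2 * hits / (k + total_relevant) if hits else 0.0)
    return scores


def best_cutoff(labels: List[int], min_recall: Optional[float] = None) -> Optional[int]:
    """Oracle 1-based cut position maximizing F1, optionally requiring recall >= min_recall.

    Returns None if no position satisfies the recall constraint.
    """
    total_relevant = sum(labels)
    scores = f1_at_cutoffs(labels)
    best_k, best_score, hits = None, -1.0, 0
    for k, (label, score) in enumerate(zip(labels, scores), start=1):
        hits += label
        recall = hits / total_relevant if total_relevant else 0.0
        if min_recall is not None and recall < min_recall:
            continue  # this cut violates the recall constraint
        if score > best_score:
            best_k, best_score = k, score
    return best_k


def raml_targets(rewards: List[float], tau: float = 1.0) -> List[float]:
    """Target distribution over cut positions, proportional to exp(reward / tau)."""
    max_reward = max(rewards)  # subtract the max for numerical stability
    weights = [math.exp((r - max_reward) / tau) for r in rewards]
    normalizer = sum(weights)
    return [w / normalizer for w in weights]


# Hypothetical ranked list: 1 = relevant document, 0 = non-relevant document.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
rewards = f1_at_cutoffs(labels)
print(best_cutoff(labels))                  # balanced objective: best F1
print(best_cutoff(labels, min_recall=1.0))  # best F1 subject to full recall
print(raml_targets(rewards, tau=0.1))       # soft targets over cut positions
```

In this toy list the unconstrained optimum cuts after the fourth document, while requiring full recall pushes the cut to the seventh, which illustrates why the constrained objective tends to return longer lists. A smaller `tau` concentrates the target distribution on the highest-reward cut; a larger `tau` spreads probability over near-optimal cuts.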

Published

2021-05-18

How to Cite

Wu, C., Zhang, R., Guo, J., Fan, Y., Lan, Y., & Cheng, X. (2021). Learning to Truncate Ranked Lists for Information Retrieval. Proceedings of the AAAI Conference on Artificial Intelligence, 35(5), 4453–4461. https://doi.org/10.1609/aaai.v35i5.16572

Section

AAAI Technical Track on Data Mining and Knowledge Management