Cross-Validated Off-Policy Evaluation

Authors

  • Matej Cief (Brno University of Technology; Kempelen Institute of Intelligent Technologies)
  • Branislav Kveton (Adobe Research)
  • Michal Kompan (Kempelen Institute of Intelligent Technologies)

DOI:

https://doi.org/10.1609/aaai.v39i15.33765

Abstract

We study estimator selection and hyper-parameter tuning in off-policy evaluation. Although cross-validation is the most popular method for model selection in supervised learning, off-policy evaluation relies mostly on theory, which provides only limited guidance to practitioners. We show how to use cross-validation for off-policy evaluation. This challenges a popular belief that cross-validation in off-policy evaluation is not feasible. We evaluate our method empirically and show that it addresses a variety of use cases.

Published

2025-04-11

How to Cite

Cief, M., Kveton, B., & Kompan, M. (2025). Cross-Validated Off-Policy Evaluation. Proceedings of the AAAI Conference on Artificial Intelligence, 39(15), 16073–16081. https://doi.org/10.1609/aaai.v39i15.33765

Section

AAAI Technical Track on Machine Learning I