A Bayesian Approach for Subset Selection in Contextual Bandits

Authors

  • Jialian Li, Tsinghua University
  • Chao Du, Alibaba Group
  • Jun Zhu, Tsinghua University

DOI:

https://doi.org/10.1609/aaai.v35i9.17019

Keywords:

Online Learning & Bandits

Abstract

Subset selection in contextual bandits (CB) is an important task in applications such as advertisement recommendation. In CB, each arm is associated with a context, so the arms are correlated in the context space. Proper exploration for subset selection in CB should therefore take the contexts into account. Previous work mainly concentrates on best-arm identification in linear bandit problems, where the expected rewards depend linearly on the contexts. However, these methods rely heavily on linearity and cannot be easily extended to more general cases. We propose a novel Bayesian approach for subset selection in general CB, where the reward functions can be nonlinear. Our method provides a principled way to use contextual information and explore the arms efficiently. For cases with relatively smooth posteriors, we give theoretical results that are comparable to those of previous work. For general cases, we provide a computable approximate variant. Empirical results show the effectiveness of our method on both linear bandits and general CB.

Published

2021-05-18

How to Cite

Li, J., Du, C., & Zhu, J. (2021). A Bayesian Approach for Subset Selection in Contextual Bandits. Proceedings of the AAAI Conference on Artificial Intelligence, 35(9), 8384-8391. https://doi.org/10.1609/aaai.v35i9.17019

Section

AAAI Technical Track on Machine Learning II