A Bayesian Approach for Subset Selection in Contextual Bandits


  • Jialian Li, Tsinghua University
  • Chao Du, Alibaba Group
  • Jun Zhu, Tsinghua University


Online Learning & Bandits


Subset selection in contextual bandits (CB) is an important task in applications such as advertisement recommendation. In CB, each arm is associated with a context, so arms are correlated in the context space, and effective exploration for subset selection must take these contexts into account. Previous work mainly concentrates on best-arm identification in linear bandits, where the expected reward depends linearly on the context. However, these methods rely heavily on linearity and cannot be easily extended to more general cases. We propose a novel Bayesian approach for subset selection in general CB, where the reward functions can be nonlinear. Our method provides a principled way to exploit contextual information and explore the arms efficiently. For cases with relatively smooth posteriors, we give theoretical results that are comparable to those of previous works. For general cases, we provide a computable approximate variant. Empirical results show the effectiveness of our method on both linear bandits and general CB.




How to Cite

Li, J., Du, C., & Zhu, J. (2021). A Bayesian Approach for Subset Selection in Contextual Bandits. Proceedings of the AAAI Conference on Artificial Intelligence, 35(9), 8384-8391. Retrieved from https://ojs.aaai.org/index.php/AAAI/article/view/17019



AAAI Technical Track on Machine Learning II