Infinitely Many-Armed Bandits with Budget Constraints

Authors

  • Haifang Li, Institute of Automation, Chinese Academy of Sciences
  • Yingce Xia, University of Science and Technology of China

DOI:

https://doi.org/10.1609/aaai.v31i1.10881

Keywords:

Budgeted Multi-Armed Bandits, Infinitely Many Arms, Regret Analysis

Abstract

We study the infinitely many-armed bandit problem with budget constraints, where the number of arms can be infinite and much larger than the number of possible experiments. The player aims to maximize the total expected reward subject to a budget B on the cost of pulling arms. We introduce a weak stochastic assumption on the ratio of the expected reward to the expected cost of a newly pulled arm, which characterizes its probability of being a near-optimal arm. We propose an algorithm named RCB-I for this new problem, in which the player first randomly picks K arms, where K is sub-linear in B, and then runs a finite-arm algorithm on the selected arms. Theoretical analysis shows that this simple algorithm enjoys sub-linear regret in terms of the budget B. We also provide a lower bound for any algorithm in the Bernoulli setting; the regret bound of RCB-I matches this lower bound up to a logarithmic factor. We further extend the algorithm to the any-budget setting (i.e., the budget is unknown in advance) and provide a corresponding theoretical analysis.
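The two-phase scheme described in the abstract (randomly sample K arms, then run a finite-arm budgeted bandit algorithm on them) lends itself to a compact sketch. The Python below is a minimal, hypothetical illustration rather than the paper's exact procedure: the names (`rcb_i`, `make_arm`), the UCB-style confidence bonus on the empirical reward-to-cost ratio, the lower bound on costs, and the choice K = ceil(sqrt(B)) are all assumptions made for this example; the paper's actual index and the precise sub-linear order of K come from its regret analysis.

```python
import math
import random


def rcb_i(arm_pool, budget, K):
    """Hypothetical sketch of the two-phase RCB-I scheme:
    (1) draw K arms uniformly from the (infinite) pool,
    (2) play a UCB-style index on the empirical reward/cost
        ratios of the K selected arms until the budget runs out."""
    arms = [arm_pool() for _ in range(K)]  # each call draws a fresh arm
    rewards = [0.0] * K                    # cumulative reward per arm
    costs = [0.0] * K                      # cumulative cost per arm
    pulls = [0] * K
    total_reward, t = 0.0, 0

    # Initialization: pull each selected arm once.
    for i in range(K):
        r, c = arms[i]()                   # an arm returns (reward, cost)
        rewards[i] += r
        costs[i] += c
        pulls[i] += 1
        budget -= c
        total_reward += r
        t += 1

    # Main loop: pull the arm with the highest ratio-plus-bonus index.
    while budget > 0:
        t += 1

        def index(i):
            ratio = rewards[i] / max(costs[i], 1e-12)
            bonus = math.sqrt(2.0 * math.log(t) / pulls[i])
            return ratio + bonus

        i = max(range(K), key=index)
        r, c = arms[i]()
        rewards[i] += r
        costs[i] += c
        pulls[i] += 1
        budget -= c
        total_reward += r
    return total_reward


# Illustrative use: each new arm has a Bernoulli reward with a freshly
# drawn parameter p, and a cost uniform on [0.1, 1] (purely hypothetical
# distributions chosen so every pull has strictly positive cost).
def make_arm():
    p = random.random()
    return lambda: (float(random.random() < p), random.uniform(0.1, 1.0))


B = 1000.0
K = int(math.ceil(B ** 0.5))  # an illustrative sub-linear choice of K
print(rcb_i(make_arm, B, K))
```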

Published

2017-02-13

How to Cite

Li, H., & Xia, Y. (2017). Infinitely Many-Armed Bandits with Budget Constraints. Proceedings of the AAAI Conference on Artificial Intelligence, 31(1). https://doi.org/10.1609/aaai.v31i1.10881