Knapsack Based Optimal Policies for Budget–Limited Multi–Armed Bandits

Authors

  • Long Tran-Thanh, University of Southampton
  • Archie Chapman, The University of Sydney
  • Alex Rogers, University of Southampton
  • Nicholas Jennings, University of Southampton

DOI:

https://doi.org/10.1609/aaai.v26i1.8279

Keywords:

sequential decision making under uncertainty, multi-armed bandits, cost budget

Abstract

In budget–limited multi–armed bandit (MAB) problems, the learner's actions are costly and constrained by a fixed budget. Consequently, an optimal exploitation policy may not be to pull the optimal arm repeatedly, as is the case in other variants of MAB, but rather to pull the sequence of different arms that maximises the agent's total reward within the budget. This difference from existing MABs means that new approaches to maximising the total reward are required. Given this, we develop two pulling policies, namely: (i) KUBE; and (ii) fractional KUBE. Whereas the former achieves up to 40% better performance in our experimental settings, the latter is computationally less expensive. We also prove logarithmic upper bounds for the regret of both policies, and show that these bounds are asymptotically optimal (i.e. they only differ from the best possible regret by a constant factor).
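
To make the knapsack-based idea in the abstract concrete, the sketch below illustrates a greedy, fractional-relaxation style policy for the budget-limited setting: at each round it pulls the affordable arm with the highest upper-confidence reward-to-cost density until the budget runs out. The arm representation, the confidence-bound form, and the helper fractional_kube_style are illustrative assumptions for this sketch, not the paper's exact KUBE or fractional KUBE definitions.

```python
import math
import random

def fractional_kube_style(arms, budget, seed=0):
    """Illustrative budget-limited bandit policy (hypothetical sketch).

    arms: list of (cost, reward_sampler) pairs; costs are known, rewards in [0, 1].
    budget: total budget available for pulls.
    Returns the total reward collected before the budget is exhausted.
    """
    rng = random.Random(seed)
    k = len(arms)
    counts = [0] * k          # number of pulls per arm
    means = [0.0] * k         # empirical mean reward per arm
    total_reward = 0.0
    t = 0

    def pull(i):
        nonlocal budget, total_reward, t
        cost, sampler = arms[i]
        reward = sampler(rng)
        budget -= cost
        total_reward += reward
        counts[i] += 1
        means[i] += (reward - means[i]) / counts[i]
        t += 1

    # Initialisation: pull each affordable arm once to seed the estimates.
    for i, (cost, _) in enumerate(arms):
        if budget >= cost:
            pull(i)

    # Main loop: greedily pull the affordable arm with the highest
    # upper-confidence reward-to-cost density, i.e. the greedy (fractional)
    # solution of a knapsack over optimistic value estimates.
    while True:
        affordable = [i for i, (c, _) in enumerate(arms) if c <= budget]
        if not affordable:
            break
        ucb = [means[i] + math.sqrt(2 * math.log(max(t, 2)) / counts[i])
               if counts[i] > 0 else float("inf") for i in range(k)]
        best = max(affordable, key=lambda i: ucb[i] / arms[i][0])
        pull(best)

    return total_reward

# Example: three arms with unit-interval rewards and differing pull costs.
arms = [
    (1.0, lambda rng: rng.betavariate(2, 5)),   # cheap, low mean reward
    (2.0, lambda rng: rng.betavariate(5, 2)),   # moderate cost, high mean reward
    (4.0, lambda rng: rng.betavariate(4, 4)),   # expensive, medium mean reward
]
print(fractional_kube_style(arms, budget=100.0))
```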

Published

2021-09-20

How to Cite

Tran-Thanh, L., Chapman, A., Rogers, A., & Jennings, N. (2021). Knapsack Based Optimal Policies for Budget–Limited Multi–Armed Bandits. Proceedings of the AAAI Conference on Artificial Intelligence, 26(1), 1134-1140. https://doi.org/10.1609/aaai.v26i1.8279

Section

AAAI Technical Track: Machine Learning