Flexible Budgets in Restless Bandits: A Primal-Dual Algorithm for Efficient Budget Allocation
DOI: https://doi.org/10.1609/aaai.v37i10.26427
Keywords: PRS: Planning With Markov Models (MDPs, POMDPs), MAS: Multiagent Planning
Abstract
Restless multi-armed bandits (RMABs) are an important model for optimizing the allocation of limited resources in sequential decision-making settings. Typical RMABs assume the budget, i.e., the number of arms pulled, is fixed at each step of the planning horizon. However, in realistic real-world planning, resources are not necessarily limited at each planning step; surplus resources from one round may be redistributed to an earlier or later round. In real-world planning settings, this budget flexibility is often constrained to a subset of consecutive planning steps, e.g., weekly planning of a monthly budget. In this paper we define a general class of RMABs with flexible budgets, which we term F-RMABs, and provide an algorithm to solve them optimally. We derive a min-max formulation for finding optimal F-RMAB policies and leverage gradient primal-dual algorithms to compute reward-maximizing policies under flexible budgets. We introduce a scheme for sampling expected gradients that makes primal-dual algorithms applicable to the F-RMAB setting, rendering an otherwise computationally expensive approach tractable. Additionally, we provide heuristics that trade off solution quality for efficiency and present experimental comparisons of different F-RMAB solution approaches.
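To make the min-max idea concrete, the following is a minimal toy sketch, not the paper's actual algorithm: a projected gradient primal-dual loop that does ascent on per-step arm-pull probabilities against a concave surrogate reward and descent on one dual multiplier per flexible-budget window, so that steps within a window can borrow budget from one another. All quantities here (reward weights r, window length W, step sizes) are illustrative assumptions.

    # Toy primal-dual sketch of a flexible-budget allocation (illustrative only).
    # Maximize a concave surrogate reward over pull probabilities x, subject to
    # a budget that is flexible within each window of W consecutive steps.
    import numpy as np

    rng = np.random.default_rng(0)
    N, H, W, B = 5, 8, 4, 2          # arms, horizon, window length, per-step budget
    r = rng.uniform(0.5, 1.5, N)     # assumed per-arm reward weights

    x = np.full((H, N), 0.5)         # primal: pull probabilities per step and arm
    lam = np.zeros(H // W)           # dual: one multiplier per flexible-budget window

    eta_x, eta_lam = 0.05, 0.05
    for _ in range(2000):
        # Lagrangian: sum_t r . log(1 + x_t) + sum_w lam_w * (W*B - pulls in window w)
        grad_x = r / (1.0 + x)                       # gradient of surrogate reward
        grad_x -= np.repeat(lam, W)[:, None]         # budget price within each window
        x = np.clip(x + eta_x * grad_x, 0.0, 1.0)    # projected primal ascent

        window_use = x.reshape(-1, W, N).sum(axis=(1, 2))            # pulls per window
        lam = np.maximum(lam + eta_lam * (window_use - W * B), 0.0)  # projected dual step

    print("per-window usage (target <= %d):" % (W * B), np.round(window_use, 2))

At convergence the multiplier lam acts as a per-window shadow price on the budget, which is the intuition behind solving the min-max formulation with alternating primal and dual gradient steps.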
Published
2023-06-26
How to Cite
Rodriguez Diaz, P., Killian, J. A., Xu, L., Suggala, A. S., Taneja, A., & Tambe, M. (2023). Flexible Budgets in Restless Bandits: A Primal-Dual Algorithm for Efficient Budget Allocation. Proceedings of the AAAI Conference on Artificial Intelligence, 37(10), 12103-12111. https://doi.org/10.1609/aaai.v37i10.26427
Issue
Vol. 37 No. 10 (2023)
Section
AAAI Technical Track on Planning, Routing, and Scheduling