On Interruptible Pure Exploration in Multi-Armed Bandits

Alexander Shleyfman; Antonín Komenda; Carmel Domshlak

doi:10.1609/aaai.v29i1.9688

On Interruptible Pure Exploration in Multi-Armed Bandits

Authors

Alexander Shleyfman Technion – Israel Institute of Technology
Antonín Komenda Czech Technical University in Prague
Carmel Domshlak Technion – Israel Institute of Technology

DOI:

https://doi.org/10.1609/aaai.v29i1.9688

Keywords:

Multi-Armed Bandit, Monte-Carlo, Search

Abstract

Interruptible pure exploration in multi-armed bandits (MABs) is a key component of Monte-Carlo tree search algorithms for sequential decision problems. We introduce Discriminative Bucketing (DB), a novel family of strategies for pure exploration in MABs, which allows for adapting recent advances in non-interruptible strategies to the interruptible setting, while guaranteeing exponential-rate performance improvement over time. Our experimental evaluation demonstrates that the corresponding instances of DB favorably compete both with the currently popular strategies UCB1 and Epsilon-Greedy, as well as with the conservative uniform sampling.

Downloads

Published

2015-03-04

How to Cite

Shleyfman, A., Komenda, A., & Domshlak, C. (2015). On Interruptible Pure Exploration in Multi-Armed Bandits. Proceedings of the AAAI Conference on Artificial Intelligence, 29(1). https://doi.org/10.1609/aaai.v29i1.9688

Download Citation

Issue

Vol. 29 No. 1 (2015): Twenty-Ninth AAAI Conference on Artificial Intelligence

Section

AAAI Technical Track: Reasoning under Uncertainty

On Interruptible Pure Exploration in Multi-Armed Bandits

Authors

DOI:

Keywords:

Abstract

Downloads

Published

How to Cite

Issue

Section

Information

Developed By

Subscription