A Change-Detection Based Framework for Piecewise-Stationary Multi-Armed Bandit Problem

Authors

  • Fang Liu The Ohio State University
  • Joohyun Lee The Ohio State University
  • Ness Shroff The Ohio State University

DOI:

https://doi.org/10.1609/aaai.v32i1.11746

Keywords:

Multi-armed bandits, Online learning, Change detection

Abstract

The multi-armed bandit problem has been extensively studied under the stationary assumption. However, in reality, this assumption often does not hold because the distributions of rewards themselves may change over time. In this paper, we propose a change-detection (CD) based framework for multi-armed bandit problems under the piecewise-stationary setting, and study a class of change-detection based UCB (Upper Confidence Bound) policies, CD-UCB, that actively detect change points and restart the UCB indices. We then develop CUSUM-UCB and PHT-UCB, which belong to the CD-UCB class and use the cumulative sum (CUSUM) and the Page-Hinkley Test (PHT), respectively, to detect changes. We show that CUSUM-UCB achieves the best known regret upper bound under mild assumptions. We also demonstrate the regret reduction of the CD-UCB policies on arbitrary Bernoulli rewards and on Yahoo! datasets of webpage click-through rates.

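To illustrate the idea described in the abstract (combining UCB indices with per-arm change detectors and restarting statistics when an alarm fires), here is a minimal Python sketch. It is not the authors' exact CUSUM-UCB or PHT-UCB algorithm; the detector parameters (`drift`, `threshold`), the absence of forced exploration, and the per-arm reset rule are illustrative assumptions.

```python
"""Minimal sketch of a change-detection based UCB bandit (CD-UCB style).
Illustrative only; not the paper's exact CUSUM-UCB/PHT-UCB procedure."""
import math


class CusumDetector:
    """Simple two-sided CUSUM-style change detector for one arm (illustrative)."""

    def __init__(self, drift=0.05, threshold=10.0):
        self.drift = drift          # minimum shift magnitude tolerated before accumulating
        self.threshold = threshold  # alarm threshold
        self.mean = 0.0             # running mean of observed rewards
        self.count = 0
        self.g_pos = 0.0            # accumulated positive deviation
        self.g_neg = 0.0            # accumulated negative deviation

    def update(self, x):
        """Feed one reward; return True if a change is detected."""
        self.count += 1
        self.mean += (x - self.mean) / self.count
        self.g_pos = max(0.0, self.g_pos + x - self.mean - self.drift)
        self.g_neg = max(0.0, self.g_neg - x + self.mean - self.drift)
        return self.g_pos > self.threshold or self.g_neg > self.threshold

    def reset(self):
        self.__init__(self.drift, self.threshold)


class CDUCB:
    """UCB1 indices combined with per-arm change detectors; an arm's
    statistics are restarted whenever its detector raises an alarm."""

    def __init__(self, n_arms, drift=0.05, threshold=10.0):
        self.n_arms = n_arms
        self.counts = [0] * n_arms
        self.means = [0.0] * n_arms
        self.t = 0
        self.detectors = [CusumDetector(drift, threshold) for _ in range(n_arms)]

    def select_arm(self):
        self.t += 1
        # Pull each arm once before relying on UCB indices.
        for a in range(self.n_arms):
            if self.counts[a] == 0:
                return a
        return max(
            range(self.n_arms),
            key=lambda a: self.means[a]
            + math.sqrt(2.0 * math.log(self.t) / self.counts[a]),
        )

    def update(self, arm, reward):
        if self.detectors[arm].update(reward):
            # Change detected: restart this arm's UCB statistics.
            self.counts[arm] = 0
            self.means[arm] = 0.0
            self.detectors[arm].reset()
        else:
            self.counts[arm] += 1
            self.means[arm] += (reward - self.means[arm]) / self.counts[arm]
```

In a piecewise-stationary environment, the detector lets the policy discard stale reward estimates shortly after a change point instead of averaging over the entire history, which is the source of the regret reduction reported in the paper.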
Published

2018-04-29

How to Cite

Liu, F., Lee, J., & Shroff, N. (2018). A Change-Detection Based Framework for Piecewise-Stationary Multi-Armed Bandit Problem. Proceedings of the AAAI Conference on Artificial Intelligence, 32(1). https://doi.org/10.1609/aaai.v32i1.11746