Spectral Thompson Sampling

Authors

  • Tomáš Kocák, INRIA Lille - Nord Europe
  • Michal Valko, INRIA Lille - Nord Europe
  • Rémi Munos, INRIA Lille - Nord Europe and Microsoft Research, New England, USA
  • Shipra Agrawal, Microsoft Research, Bangalore

DOI:

https://doi.org/10.1609/aaai.v28i1.9011

Keywords:

spectral bandits, Thompson sampling, smooth functions on graphs

Abstract

Thompson Sampling (TS) has attracted a lot of interest due to its good empirical performance, in particular in computational advertising. Though TS has been successful, the tools for analyzing its performance appeared only recently. In this paper, we describe and analyze the SpectralTS algorithm for a bandit problem where the payoffs of the choices are smooth with respect to an underlying graph. In this setting, each choice is a node of a graph, and the expected payoffs of neighboring nodes are assumed to be similar. Although the setting has applications in both recommender systems and advertising, traditional algorithms scale poorly with the number of choices. To address this, we consider an effective dimension d, which is small in real-world graphs. We provide an analysis showing that the regret of SpectralTS scales as $d\sqrt{T \ln N}$ with high probability, where T is the time horizon and N is the number of choices. Since a $d\sqrt{T \ln N}$ regret is comparable to the known results, SpectralTS offers a computationally more efficient alternative. We also show that our algorithm is competitive on both synthetic and real-world data.
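
The setting in the abstract can be illustrated with a minimal sketch of linear Thompson Sampling run in the eigenbasis of a graph Laplacian. This is not the authors' exact SpectralTS procedure: the function name `spectral_ts`, the parameters `reg`, `scale`, and `noise`, and the Gaussian reward noise are all assumptions made for illustration, and the sampling scale is a free constant rather than the value used in the paper's analysis.

```python
import numpy as np

def spectral_ts(adjacency, true_means, horizon, reg=1.0, scale=0.5,
                noise=0.1, seed=0):
    """Illustrative Thompson Sampling over graph nodes (hypothetical helper).

    adjacency  : symmetric (N, N) weight matrix of the graph
    true_means : length-N vector of expected payoffs, used only to simulate
    reg, scale, noise : assumed constants, not taken from the paper
    """
    rng = np.random.default_rng(seed)
    n = adjacency.shape[0]

    # Graph Laplacian L = D - W and its eigendecomposition L = Q diag(lam) Q^T.
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
    eigvals, eigvecs = np.linalg.eigh(laplacian)

    # Row i of Q is the feature vector of node i in the spectral basis.
    features = eigvecs

    # Precision matrix starts at diag(lam) + reg * I, which penalizes
    # weight on high-frequency (non-smooth) eigenvectors.
    precision = np.diag(eigvals + reg)
    weighted_rewards = np.zeros(n)

    best = true_means.max()
    regret = 0.0
    choices = []
    for _ in range(horizon):
        cov = np.linalg.inv(precision)
        cov = (cov + cov.T) / 2.0  # symmetrize against round-off
        mean = cov @ weighted_rewards

        # Sample a parameter vector and act greedily with respect to it.
        sample = rng.multivariate_normal(mean, scale ** 2 * cov)
        arm = int(np.argmax(features @ sample))

        # Simulated noisy payoff for the chosen node.
        reward = true_means[arm] + noise * rng.standard_normal()

        # Rank-one update of the regularized least-squares statistics.
        x = features[arm]
        precision += np.outer(x, x)
        weighted_rewards += reward * x

        regret += best - true_means[arm]
        choices.append(arm)
    return np.array(choices), regret
```

Calling `spectral_ts` with the adjacency matrix of a small path graph and a slowly varying payoff vector returns the sequence of chosen nodes and the cumulative regret of the simulated run. The sketch inverts the full N x N matrix at every step for clarity; it makes no attempt to reproduce the computational savings discussed in the abstract.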

Published

2014-06-21

How to Cite

Kocák, T., Valko, M., Munos, R., & Agrawal, S. (2014). Spectral Thompson Sampling. Proceedings of the AAAI Conference on Artificial Intelligence, 28(1). https://doi.org/10.1609/aaai.v28i1.9011

Section

Main Track: Novel Machine Learning Algorithms