Quantum Best Arm Identification with Quantum Oracles

Authors

  • Xuchuang Wang University of Massachusetts, Amherst, MA
  • Yu-Zhen Janice Chen University of Massachusetts, Amherst, MA
  • Matheus Guedes de Andrade University of Massachusetts, Amherst, MA
  • Jonathan Allcock Tencent Quantum Lab
  • Mohammad Hajiesmaili University of Massachusetts, Amherst, MA
  • John C.S. Lui Chinese University of Hong Kong
  • Don Towsley University of Massachusetts, Amherst, MA

DOI:

https://doi.org/10.1609/aaai.v39i20.35432

Abstract

Best arm identification (BAI) is a key problem in stochastic multi-armed bandits, where K arms each has an associated reward distribution, and the objective is to minimize the number of queries needed to identify the best arm with high confidence. In this paper, we explore BAI using quantum oracles. For the case where each query probes only one arm (m=1), we devise a quantum algorithm with a query complexity upper bound of O((K/Delta)log(1/delta)), where delta is the confidence parameter and Delta is the reward gap between best and second best arms. This improves on the classical bound by a factor of 1/Delta. For the general case where a single query can probe m arms (1<= m<= K) simultaneously, we propose an algorithm with an upper bound of O((K/(Delta x sqrt(m))) log(1/delta)), improving by a factor of sqrt(m) compared to the m=1 case. We also provide query complexity lower bounds for both scenarios, which match the upper bounds up to logarithmic factors, and validate our theoretical results with Qiskit-based simulations.

Published

2025-04-11

How to Cite

Wang, X., Chen, Y.-Z. J., Guedes de Andrade, M., Allcock, J., Hajiesmaili, M., Lui, J. C., & Towsley, D. (2025). Quantum Best Arm Identification with Quantum Oracles. Proceedings of the AAAI Conference on Artificial Intelligence, 39(20), 21321–21329. https://doi.org/10.1609/aaai.v39i20.35432

Issue

Section

AAAI Technical Track on Machine Learning VI