Bounding Regret in Empirical Games

Authors

  • Steven Jecmen Carnegie Mellon University
  • Arunesh Sinha Singapore Management University
  • Zun Li University of Michigan
  • Long Tran-Thanh University of Southampton

DOI:

https://doi.org/10.1609/aaai.v34i04.5851

Abstract

Empirical game-theoretic analysis refers to a set of models and techniques for solving large-scale games. However, existing methods lack a quantitative guarantee on the quality of the output approximate Nash equilibria (NE). A natural quantitative guarantee for such an approximate NE is its regret in the game (i.e., the gain of the best deviation). We formulate this deviation-gain computation as a multi-armed bandit problem, with a new optimization goal unlike those studied in prior work. We propose an efficient algorithm, Super-Arm UCB (SAUCB), for the problem, along with a number of variants. We present sample complexity results as well as extensive experiments showing that SAUCB outperforms several baselines.

Published

2020-04-03

How to Cite

Jecmen, S., Sinha, A., Li, Z., & Tran-Thanh, L. (2020). Bounding Regret in Empirical Games. Proceedings of the AAAI Conference on Artificial Intelligence, 34(04), 4280-4287. https://doi.org/10.1609/aaai.v34i04.5851

Section

AAAI Technical Track: Machine Learning