Bounding Regret in Empirical Games

Steven Jecmen; Arunesh Sinha; Zun Li; Long Tran-Thanh

doi:10.1609/aaai.v34i04.5851

Authors

Steven Jecmen, Carnegie Mellon University
Arunesh Sinha, Singapore Management University
Zun Li, University of Michigan
Long Tran-Thanh, University of Southampton

Abstract

Empirical game-theoretic analysis refers to a set of models and techniques for solving large-scale games. However, these techniques lack a quantitative guarantee on the quality of the approximate Nash equilibria (NE) they output. A natural quantitative guarantee for such an approximate NE is its regret in the game (i.e., the best deviation gain). We formulate this deviation gain computation as a multi-armed bandit problem, with a new optimization goal unlike those studied in prior work. We propose an efficient algorithm, Super-Arm UCB (SAUCB), for the problem, along with a number of variants. We present sample complexity results as well as extensive experiments showing that SAUCB outperforms several baselines.
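
To make the notion of regret as "best deviation gain" concrete, the following is a minimal sketch for a two-player matrix game whose payoffs are known exactly. It is not the paper's SAUCB algorithm; the function names and the matching-pennies payoffs are assumptions chosen for illustration. In an empirical game the payoffs would instead be estimated from noisy simulation samples, which is the setting the bandit formulation addresses.

```python
from typing import Sequence

Matrix = Sequence[Sequence[float]]


def expected_payoff(payoff: Matrix, row_mix: Sequence[float],
                    col_mix: Sequence[float]) -> float:
    """Expected payoff under independent mixed strategies for both players."""
    return sum(
        row_mix[i] * col_mix[j] * payoff[i][j]
        for i in range(len(row_mix))
        for j in range(len(col_mix))
    )


def profile_regret(row_payoff: Matrix, col_payoff: Matrix,
                   row_mix: Sequence[float],
                   col_mix: Sequence[float]) -> float:
    """Largest gain any single player can obtain by deviating to a pure strategy."""
    n_rows, n_cols = len(row_mix), len(col_mix)

    # Payoff each player currently receives under the profile.
    row_value = expected_payoff(row_payoff, row_mix, col_mix)
    col_value = expected_payoff(col_payoff, row_mix, col_mix)

    # Best pure-strategy response of each player, holding the opponent fixed.
    row_best = max(
        sum(col_mix[j] * row_payoff[i][j] for j in range(n_cols))
        for i in range(n_rows)
    )
    col_best = max(
        sum(row_mix[i] * col_payoff[i][j] for i in range(n_rows))
        for j in range(n_cols)
    )

    return max(row_best - row_value, col_best - col_value)


# Hypothetical example game: matching pennies (zero-sum).
ROW_PAYOFF = [[1.0, -1.0], [-1.0, 1.0]]
COL_PAYOFF = [[-1.0, 1.0], [1.0, -1.0]]

# Uniform mixing is the exact NE of this game, so its regret is zero.
uniform_regret = profile_regret(ROW_PAYOFF, COL_PAYOFF, [0.5, 0.5], [0.5, 0.5])

# A pure profile is not an equilibrium here: the column player gains by switching.
pure_regret = profile_regret(ROW_PAYOFF, COL_PAYOFF, [1.0, 0.0], [1.0, 0.0])
```

A profile with regret at most ε is an ε-approximate NE, so an upper bound on this quantity serves as a quality certificate for a candidate equilibrium.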
