Solving Games with Functional Regret Estimation

Kevin Waugh; Dustin Morrill; James Bagnell; Michael Bowling

doi:10.1609/aaai.v29i1.9445

Authors

Kevin Waugh Carnegie Mellon University
Dustin Morrill University of Alberta
James Bagnell Carnegie Mellon University
Michael Bowling University of Alberta

DOI:

https://doi.org/10.1609/aaai.v29i1.9445

Keywords:

Extensive-form Games, Abstraction, Nash Equilibrium, Regret, Counterfactual Regret

Abstract

We propose a novel online learning method for minimizing regret in large extensive-form games. The approach learns a function approximator online to estimate the regret for choosing a particular action. A no-regret algorithm uses these estimates in place of the true regrets to define a sequence of policies. We prove the approach sound by providing a bound relating the quality of the function approximation and regret of the algorithm. A corollary being that the method is guaranteed to converge to a Nash equilibrium in self-play so long as the regrets are ultimately realizable by the function approximator. Our technique can be understood as a principled generalization of existing work onabstraction in large games; in our work, both the abstraction as well as the equilibrium are learned during self-play. We demonstrate empirically the method achieves higher quality strategies than state-of-the-art abstraction techniques given the same resources.

Solving Games with Functional Regret Estimation

Authors

DOI:

Keywords:

Abstract

Downloads

Published

How to Cite

Issue

Section

Information

Developed By

Subscription