Regret Transfer and Parameter Optimization


  • Noam Brown, Carnegie Mellon University
  • Tuomas Sandholm, Carnegie Mellon University



Keywords

Incomplete-Information Games, Poker, Game Solving, No-Regret Learning, Counterfactual Regret Minimization, Regret Matching, Regret Minimization


Abstract

Regret matching is a widely used algorithm for learning how to act. We begin by proving that regrets on actions in one setting (game) can be transferred to warm start the regrets for solving a different setting with the same structure but different payoffs that can be written as a function of parameters. We show how this can be done by carefully discounting the prior regrets. This provides, to our knowledge, the first principled warm-starting method for no-regret learning. It also extends to warm starting the widely adopted counterfactual regret minimization (CFR) algorithm for large incomplete-information games; we demonstrate this experimentally as well. We then study optimizing a parameter vector for a player in a two-player zero-sum game (e.g., optimizing bet sizes to use in poker). We propose a custom gradient descent algorithm that provably finds a locally optimal parameter vector while leveraging our warm-start theory to significantly reduce the number of regret-matching iterations needed at each step. It optimizes the parameter vector while simultaneously finding an equilibrium. We present experiments in no-limit Leduc Hold'em and no-limit Texas Hold'em to optimize bet sizing. This amounts to the first action abstraction algorithm (an algorithm for selecting a small number of discrete actions to use from a continuum of actions, a key preprocessing step for solving large games using current equilibrium-finding algorithms) with convergence guarantees for extensive-form games.
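For readers unfamiliar with the regret-matching algorithm the abstract builds on, the following is a minimal self-play sketch on a two-player zero-sum matrix game. The function names and the matrix-game setting are illustrative, not from the paper, and the paper's carefully derived discount factor for regret transfer is not reproduced here: the `regrets` argument simply accepts prior regrets that the caller has already scaled down, in the spirit of the warm-start idea.

```python
import numpy as np

def regret_matching_policy(regrets):
    """Map cumulative regrets to a strategy: play each action in
    proportion to its positive regret (uniform if none is positive)."""
    positive = np.maximum(regrets, 0.0)
    total = positive.sum()
    if total > 0:
        return positive / total
    return np.full(len(regrets), 1.0 / len(regrets))

def run_regret_matching(payoff, iterations, regrets=None):
    """Self-play regret matching on a zero-sum matrix game.

    payoff[i, j] is the row player's payoff. Passing prior `regrets`
    (a pair of arrays, pre-discounted by the caller) warm starts the
    run; the paper's exact discounting scheme is not implemented here.
    Returns the players' average strategies and final regrets.
    """
    n_rows, n_cols = payoff.shape
    row_regrets = np.zeros(n_rows) if regrets is None else regrets[0].copy()
    col_regrets = np.zeros(n_cols) if regrets is None else regrets[1].copy()
    row_avg = np.zeros(n_rows)
    col_avg = np.zeros(n_cols)
    for _ in range(iterations):
        row_strat = regret_matching_policy(row_regrets)
        col_strat = regret_matching_policy(col_regrets)
        # Value of each pure action against the opponent's current mix.
        row_values = payoff @ col_strat
        col_values = -(row_strat @ payoff)
        # Accumulate regret: action value minus expected value achieved.
        row_regrets += row_values - row_strat @ row_values
        col_regrets += col_values - col_strat @ col_values
        row_avg += row_strat
        col_avg += col_strat
    return row_avg / iterations, col_avg / iterations, (row_regrets, col_regrets)
```

In rock-paper-scissors, for example, the players' average strategies converge toward the uniform equilibrium, and the final regrets returned by one run can be discounted and passed back in to warm start a run on a perturbed payoff matrix.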




How to Cite

Brown, N., & Sandholm, T. (2014). Regret Transfer and Parameter Optimization. Proceedings of the AAAI Conference on Artificial Intelligence, 28(1).



AAAI Technical Track: Game Theory and Economic Paradigms