Faster Game Solving via Hyperparameter Schedules

Authors

  • Naifeng Zhang, Carnegie Mellon University
  • Stephen Marcus McAleer, Anthropic
  • Tuomas Sandholm, Carnegie Mellon University; Strategy Robot, Inc.; Strategic Machine, Inc.; Optimized Markets, Inc.

DOI:

https://doi.org/10.1609/aaai.v40i20.38784

Abstract

Counterfactual regret minimization (CFR) algorithms are a foundational class of methods for solving imperfect-information games, with the time average of their iterates converging to a Nash equilibrium in two-player zero-sum games. Prior state-of-the-art variants, Discounted CFR (DCFR) and Predictive CFR+ (PCFR+), achieved the fastest known practical performance by improving convergence rates over vanilla CFR through discounting early iterations with a fixed discounting scheme. More recently, Dynamic DCFR (DDCFR) introduced agent-learned dynamic discounting schemes to further accelerate convergence, at the cost of increased complexity. To address this, we propose Hyperparameter Schedules (HSs), a remarkably simple, training-free framework that dynamically adjusts CFR discounting over time. HSs aggressively downweight early updates and gradually transition to trusting late-stage strategies, leading to substantially faster convergence with only a few lines of code modifications. We show that HSs derived from just three small extensive-form games generalize effectively to 17 diverse games (including large-scale realistic poker) in both extensive-form and normal-form settings, without any game-specific tuning. Our method establishes a new state of the art for solving two-player zero-sum games.
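
To make the idea of time-varying discounting concrete, the following is a minimal sketch, not the authors' implementation. It runs DCFR-style regret matching on a small zero-sum matrix game and draws the discounting hyperparameters from a schedule instead of fixing them. The discount factors follow the commonly stated DCFR form (positive regrets scaled by t^α/(t^α+1), negative regrets by t^β/(t^β+1), average-strategy mass by (t/(t+1))^γ). The function names, the payoff matrix in the usage note, and the schedule values in `example_schedule` are hypothetical choices made for illustration; the abstract does not state the schedules used in the paper.

```python
def dcfr_discounts(t, alpha, beta, gamma):
    """DCFR-style discount factors at iteration t (t >= 1)."""
    pos = t ** alpha / (t ** alpha + 1.0)   # applied to positive cumulative regret
    neg = t ** beta / (t ** beta + 1.0)     # applied to negative cumulative regret
    strat = (t / (t + 1.0)) ** gamma        # applied to accumulated average strategy
    return pos, neg, strat


def example_schedule(t, horizon):
    """HYPOTHETICAL schedule (not from the paper): gamma decays linearly from 5 to 2."""
    frac = min(t / horizon, 1.0)
    return 1.5, 0.0, 5.0 - 3.0 * frac       # (alpha, beta, gamma)


def regret_matching(regrets):
    """Play proportionally to positive regret; uniform if none is positive."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    n = len(regrets)
    return [p / total for p in positive] if total > 0 else [1.0 / n] * n


def solve(payoff, iterations, schedule=example_schedule):
    """Approximate an equilibrium of a zero-sum matrix game (row player maximizes)."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    regrets = [[0.0] * n_rows, [0.0] * n_cols]
    avg = [[0.0] * n_rows, [0.0] * n_cols]
    for t in range(1, iterations + 1):
        x = regret_matching(regrets[0])
        y = regret_matching(regrets[1])
        # Per-action values: the column player receives the negated payoff.
        u_row = [sum(payoff[i][j] * y[j] for j in range(n_cols)) for i in range(n_rows)]
        u_col = [-sum(payoff[i][j] * x[i] for i in range(n_rows)) for j in range(n_cols)]
        # The hyperparameters are looked up per iteration instead of being fixed.
        alpha, beta, gamma = schedule(t, iterations)
        pos, neg, strat = dcfr_discounts(t, alpha, beta, gamma)
        for p, (sigma, u) in enumerate(((x, u_row), (y, u_col))):
            ev = sum(s * v for s, v in zip(sigma, u))
            for a in range(len(sigma)):
                # Add the instantaneous regret, then discount the running total.
                r = regrets[p][a] + (u[a] - ev)
                regrets[p][a] = r * (pos if r > 0 else neg)
                # Discount the old average-strategy mass, then add the current strategy.
                avg[p][a] = avg[p][a] * strat + sigma[a]
    return [[v / sum(a) for v in a] for a in avg]


def exploitability(payoff, x, y):
    """Nash gap: best-response value against y minus best-response value against x."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    best_row = max(sum(payoff[i][j] * y[j] for j in range(n_cols)) for i in range(n_rows))
    best_col = min(sum(payoff[i][j] * x[i] for i in range(n_rows)) for j in range(n_cols))
    return best_row - best_col
```

As a usage example, calling `solve([[2.0, -1.0], [-1.0, 1.0]], 2000)` returns the two players' average strategies, and `exploitability` on the same matrix measures how far they are from equilibrium. Swapping in a different `schedule` function is the only change needed to try another discounting scheme, which mirrors the abstract's point that such schedules require only a few lines of code.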

Published

2026-03-14

How to Cite

Zhang, N., McAleer, S. M., & Sandholm, T. (2026). Faster Game Solving via Hyperparameter Schedules. Proceedings of the AAAI Conference on Artificial Intelligence, 40(20), 17319–17326. https://doi.org/10.1609/aaai.v40i20.38784

Issue

Vol. 40 No. 20 (2026)

Section

AAAI Technical Track on Game Theory and Economic Paradigms