Deep (Predictive) Discounted Counterfactual Regret Minimization

Authors

  • Hang Xu C2DL, Institute of Automation, Chinese Academy of Sciences; School of Artificial Intelligence, University of Chinese Academy of Sciences
  • Kai Li C2DL, Institute of Automation, Chinese Academy of Sciences; School of Artificial Intelligence, University of Chinese Academy of Sciences
  • Haobo Fu Tencent AI Lab
  • Qiang Fu Tencent AI Lab
  • Junliang Xing Tsinghua University
  • Jian Cheng C2DL, Institute of Automation, Chinese Academy of Sciences; AiRiA; Maicro.ai

DOI:

https://doi.org/10.1609/aaai.v40i20.38780

Abstract

Counterfactual regret minimization (CFR) is a family of algorithms for effectively solving imperfect-information games. To enhance CFR's applicability in large games, researchers use neural networks to approximate its behavior. However, existing methods are mainly based on vanilla CFR and struggle to integrate more advanced CFR variants. In this work, we propose an efficient model-free neural CFR algorithm that overcomes the limitations of existing methods in approximating advanced CFR variants. At each iteration, the algorithm collects variance-reduced sampled advantages based on a value network, fits cumulative advantages by bootstrapping, and applies discounting and clipping operations to simulate the update mechanisms of advanced CFR variants. Experimental results show that, compared with existing model-free neural algorithms, it converges faster in typical imperfect-information games and demonstrates stronger adversarial performance in a large poker game.
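
For readers unfamiliar with the discounting and clipping operations mentioned in the abstract, the following is a minimal tabular sketch of how such updates are commonly defined in discounted CFR and CFR+. It is not the paper's neural algorithm: the function names, the exponents alpha=1.5 and beta=0 (the usual discounted CFR defaults), and the `clip` flag are assumptions chosen for illustration.

```python
from typing import List


def regret_matching(cum_regret: List[float]) -> List[float]:
    """Turn cumulative regrets into a strategy (uniform if none are positive)."""
    positive = [max(r, 0.0) for r in cum_regret]
    total = sum(positive)
    n = len(cum_regret)
    if total <= 0.0:
        return [1.0 / n] * n
    return [p / total for p in positive]


def discounted_update(
    cum_regret: List[float],
    inst_regret: List[float],
    t: int,
    alpha: float = 1.5,   # assumed discount exponent for positive regrets
    beta: float = 0.0,    # assumed discount exponent for negative regrets
    clip: bool = False,   # assumed flag: CFR+-style clipping at zero
) -> List[float]:
    """One tabular regret update with discounting and optional clipping."""
    pos_w = t ** alpha / (t ** alpha + 1.0)
    neg_w = t ** beta / (t ** beta + 1.0)
    updated = []
    for old, new in zip(cum_regret, inst_regret):
        # Discount the accumulated regret, then add this iteration's regret.
        weight = pos_w if old > 0 else neg_w
        value = old * weight + new
        if clip:
            # Clipping discards negative cumulative regret entirely.
            value = max(value, 0.0)
        updated.append(value)
    return updated
```

In this sketch, discounting shrinks the influence of early iterations, while clipping lets an action that performed badly in the past recover quickly. A neural variant would replace the table with a network fitted to these targets, which is the part this tabular illustration does not attempt to show.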

Published

2026-03-14

How to Cite

Xu, H., Li, K., Fu, H., Fu, Q., Xing, J., & Cheng, J. (2026). Deep (Predictive) Discounted Counterfactual Regret Minimization. Proceedings of the AAAI Conference on Artificial Intelligence, 40(20), 17284–17292. https://doi.org/10.1609/aaai.v40i20.38780

Issue

Vol. 40 No. 20 (2026)

Section

AAAI Technical Track on Game Theory and Economic Paradigms