[1]

Anagnostides, I. et al. 2024. Optimistic Policy Gradient in Multi-Player Markov Games with a Single Controller: Convergence beyond the Minty Property. Proceedings of the AAAI Conference on Artificial Intelligence. 38, 9 (Mar. 2024), 9451–9459. DOI:https://doi.org/10.1609/aaai.v38i9.28799.