ANAGNOSTIDES, Ioannis; PANAGEAS, Ioannis; FARINA, Gabriele; SANDHOLM, Tuomas. Optimistic Policy Gradient in Multi-Player Markov Games with a Single Controller: Convergence beyond the Minty Property. Proceedings of the AAAI Conference on Artificial Intelligence, [S. l.], v. 38, n. 9, p. 9451–9459, 2024. DOI: 10.1609/aaai.v38i9.28799. Available at: https://ojs.aaai.org/index.php/AAAI/article/view/28799. Accessed: 30 May 2026.