Anagnostides, I., Panageas, I., Farina, G., & Sandholm, T. (2024). Optimistic Policy Gradient in Multi-Player Markov Games with a Single Controller: Convergence beyond the Minty Property. Proceedings of the AAAI Conference on Artificial Intelligence, 38(9), 9451–9459. https://doi.org/10.1609/aaai.v38i9.28799