Anagnostides, I., Panageas, I., Farina, G. and Sandholm, T. (2024) “Optimistic Policy Gradient in Multi-Player Markov Games with a Single Controller: Convergence beyond the Minty Property”, Proceedings of the AAAI Conference on Artificial Intelligence, 38(9), pp. 9451–9459. doi: 10.1609/aaai.v38i9.28799.