Lu, S., Zhang, K., Chen, T., Başar, T., & Horesh, L. (2021). Decentralized Policy Gradient Descent Ascent for Safe Multi-Agent Reinforcement Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 35(10), 8767-8775. https://doi.org/10.1609/aaai.v35i10.17062