Le, H., M. Abdolshah, T. K. George, K. Do, D. Nguyen, and S. Venkatesh. “Episodic Policy Gradient Training.” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, no. 7, June 2022, pp. 7317-25, doi:10.1609/aaai.v36i7.20694.