Le, Hung, Majid Abdolshah, Thommen K. George, Kien Do, Dung Nguyen, and Svetha Venkatesh. 2022. “Episodic Policy Gradient Training.” Proceedings of the AAAI Conference on Artificial Intelligence 36 (7): 7317–25. https://doi.org/10.1609/aaai.v36i7.20694.