Le, H. (2022) “Episodic Policy Gradient Training”, Proceedings of the AAAI Conference on Artificial Intelligence, 36(7), pp. 7317–7325. doi: 10.1609/aaai.v36i7.20694.