[1] Y. Kuang, M. Lu, J. Wang, Q. Zhou, B. Li, and H. Li, “Learning Robust Policy against Disturbance in Transition Dynamics via State-Conservative Policy Optimization”, AAAI, vol. 36, no. 7, pp. 7247-7254, Jun. 2022.