[1] L. Yang, “Policy Optimization with Stochastic Mirror Descent”, AAAI, vol. 36, no. 8, pp. 8823-8831, Jun. 2022.