1. Levy D, Ermon S. Deterministic Policy Optimization by Combining Pathwise and Score Function Estimators for Discrete Action Spaces. AAAI [Internet]. 2018 Apr 29 [cited 2024 Apr 12];32(1). Available from: https://ojs.aaai.org/index.php/AAAI/article/view/11822