[1] H. Jiang, J. Xie, and J. Yang, “Action Candidate Based Clipped Double Q-learning for Discrete and Continuous Action Tasks”, AAAI, vol. 35, no. 9, pp. 7979–7986, May 2021.