[1] M. Omura, T. Osa, Y. Mukuta, and T. Harada, “Symmetric Q-learning: Reducing Skewness of Bellman Error in Online Reinforcement Learning,” AAAI, vol. 38, no. 13, pp. 14474–14481, Mar. 2024.