Omura, M., T. Osa, Y. Mukuta, and T. Harada. “Symmetric Q-Learning: Reducing Skewness of Bellman Error in Online Reinforcement Learning”. Proceedings of the AAAI Conference on Artificial Intelligence, vol. 38, no. 13, Mar. 2024, pp. 14474-81, doi:10.1609/aaai.v38i13.29362.