Omura, Motoki, Takayuki Osa, Yusuke Mukuta, and Tatsuya Harada. “Symmetric Q-Learning: Reducing Skewness of Bellman Error in Online Reinforcement Learning.” Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 13 (March 24, 2024): 14474–14481. Accessed September 1, 2024. https://ojs.aaai.org/index.php/AAAI/article/view/29362.