Yang, Shangdong, Yang Gao, Bo An, Hao Wang, and Xingguo Chen. “Efficient Average Reward Reinforcement Learning Using Constant Shifting Values”. Proceedings of the AAAI Conference on Artificial Intelligence 30, no. 1 (March 2, 2016). Accessed May 29, 2026. https://ojs.aaai.org/index.php/AAAI/article/view/10285.