YANG, Shangdong; GAO, Yang; AN, Bo; WANG, Hao; CHEN, Xingguo. Efficient Average Reward Reinforcement Learning Using Constant Shifting Values. Proceedings of the AAAI Conference on Artificial Intelligence, [S. l.], v. 30, n. 1, 2016. DOI: 10.1609/aaai.v30i1.10285. Available at: https://ojs.aaai.org/index.php/AAAI/article/view/10285. Accessed: 29 May 2026.