XIONG, Guojun; LI, Jian; SINGH, Rahul. Reinforcement Learning Augmented Asymptotically Optimal Index Policy for Finite-Horizon Restless Bandits. Proceedings of the AAAI Conference on Artificial Intelligence, [S. l.], v. 36, n. 8, p. 8726–8734, 2022. DOI: 10.1609/aaai.v36i8.20852. Available at: https://ojs.aaai.org/index.php/AAAI/article/view/20852. Accessed: 28 May 2026.