Xiong, Guojun, et al. “Reinforcement Learning Augmented Asymptotically Optimal Index Policy for Finite-Horizon Restless Bandits”. Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, no. 8, June 2022, pp. 8726-34, doi:10.1609/aaai.v36i8.20852.