Xiong, G., Li, J., & Singh, R. (2022). Reinforcement Learning Augmented Asymptotically Optimal Index Policy for Finite-Horizon Restless Bandits. Proceedings of the AAAI Conference on Artificial Intelligence, 36(8), 8726–8734. https://doi.org/10.1609/aaai.v36i8.20852