Chen, G., S. C. Liew, and D. Gündüz. “GINO-Q: Learning an Asymptotically Optimal Index Policy for Restless Multi-Armed Bandits.” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 40, no. 24, Mar. 2026, pp. 20032-40, doi:10.1609/aaai.v40i24.39088.