Chen, G., Liew, S. C., & Gündüz, D. (2026). GINO-Q: Learning an Asymptotically Optimal Index Policy for Restless Multi-armed Bandits. Proceedings of the AAAI Conference on Artificial Intelligence, 40(24), 20032-20040. https://doi.org/10.1609/aaai.v40i24.39088