[1] Chen, G., Liew, S.C. and Gündüz, D. 2026. GINO-Q: Learning an Asymptotically Optimal Index Policy for Restless Multi-armed Bandits. Proceedings of the AAAI Conference on Artificial Intelligence. 40, 24 (Mar. 2026), 20032-20040. DOI:https://doi.org/10.1609/aaai.v40i24.39088.