Qin, Y., Li, Y., Pasqualetti, F., Fazel, M., & Oymak, S. (2023). Stochastic Contextual Bandits with Long Horizon Rewards. Proceedings of the AAAI Conference on Artificial Intelligence, 37(8), 9525–9533. https://doi.org/10.1609/aaai.v37i8.26140