Lupu, A., Durand, A., & Precup, D. (2019). Leveraging Observations in Bandits: Between Risks and Benefits. Proceedings of the AAAI Conference on Artificial Intelligence, 33(01), 6112-6119. https://doi.org/10.1609/aaai.v33i01.33016112