[1] Chang, J.Q.L. and Tan, V.Y.F. 2022. A Unifying Theory of Thompson Sampling for Continuous Risk-Averse Bandits. Proceedings of the AAAI Conference on Artificial Intelligence. 36, 6 (Jun. 2022), 6159-6166. DOI:https://doi.org/10.1609/aaai.v36i6.20564.