[1] Y.-H. Hung and P.-C. Hsieh, “Reward-Biased Maximum Likelihood Estimation for Neural Contextual Bandits: A Distributional Learning Perspective,” AAAI, vol. 37, no. 7, pp. 7944–7952, Jun. 2023.