[1] Y. Narita, S. Yasui, and K. Yata, “Efficient Counterfactual Learning from Bandit Feedback”, AAAI, vol. 33, no. 01, pp. 4634–4641, Jul. 2019.