Schmitt, S., Shawe-Taylor, J. and Hasselt, H. van (2022) “Chaining Value Functions for Off-Policy Learning”, Proceedings of the AAAI Conference on Artificial Intelligence, 36(8), pp. 8187-8195. doi: 10.1609/aaai.v36i8.20792.