(1) Gelada, C.; Bellemare, M. G. Off-Policy Deep Reinforcement Learning by Bootstrapping the Covariate Shift. AAAI 2019, 33, 3647-3655.