Korda, N., Prashanth L.A., and R. Munos. “Fast Gradient Descent for Drifting Least Squares Regression, with Application to Bandits.” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 29, no. 1, Feb. 2015, doi:10.1609/aaai.v29i1.9619.