Wu, X. (2022) “AdaLoss: A Computationally-Efficient and Provably Convergent Adaptive Gradient Method”, Proceedings of the AAAI Conference on Artificial Intelligence, 36(8), pp. 8691–8699. doi: 10.1609/aaai.v36i8.20848.