Wu, Xiaoxia, Yuege Xie, Simon Shaolei Du, and Rachel Ward. 2022. “AdaLoss: A Computationally-Efficient and Provably Convergent Adaptive Gradient Method.” Proceedings of the AAAI Conference on Artificial Intelligence 36 (8): 8691-99. https://doi.org/10.1609/aaai.v36i8.20848.