[1] X. Wu, Y. Xie, S. S. Du, and R. Ward, “AdaLoss: A Computationally-Efficient and Provably Convergent Adaptive Gradient Method,” AAAI, vol. 36, no. 8, pp. 8691–8699, Jun. 2022.