[1] Z. Huo, B. Gu, and H. Huang, “Large Batch Optimization for Deep Learning Using New Complete Layer-Wise Adaptive Rate Scaling,” AAAI, vol. 35, no. 9, pp. 7883-7890, May 2021.