Faster Double Adaptive Gradient Methods

Authors

  • Feihu Huang, Nanjing University of Aeronautics and Astronautics, MIIT Key Laboratory of Pattern Analysis and Machine Intelligence
  • Yuning Luo, Nanjing University of Aeronautics and Astronautics

DOI:

https://doi.org/10.1609/aaai.v39i25.34908

Abstract

In this paper, we propose a class of faster double adaptive gradient methods to solve nonconvex finite-sum optimization problems, possibly with nonsmooth regularization, by simultaneously using an adaptive learning rate and an adaptive mini-batch size. Specifically, we first propose a double adaptive stochastic gradient method (i.e., 2AdaSGD), and prove that our 2AdaSGD obtains a low stochastic first-order oracle (SFO) complexity for finding a stationary solution under the population smoothness condition. Furthermore, we propose a variance-reduced double adaptive stochastic gradient method (i.e., 2AdaSPIDER), and prove that our 2AdaSPIDER obtains an optimal SFO complexity under the average smoothness condition, which is lower than the SFO complexity of the existing double adaptive gradient algorithms. In particular, we introduce a new stochastic gradient mapping to adaptively adjust the mini-batch size in our stochastic gradient methods. We conduct numerical experiments to verify the efficiency of our proposed methods.
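To illustrate the "double adaptive" idea at a high level, the sketch below combines an adaptive (AdaGrad-norm-style) step size with a mini-batch size that grows as the estimated gradient norm shrinks. This is a generic toy illustration on a least-squares finite sum, not the paper's 2AdaSGD or 2AdaSPIDER; the function names, the specific step-size rule, and the batch-growth rule are all assumptions made for the example.

```python
import numpy as np

# Toy finite-sum problem: f(x) = (1/n) * sum_i 0.5 * (a_i^T x - b_i)^2.
# (Illustrative only; not the authors' algorithm or experimental setup.)
rng = np.random.default_rng(0)
n, d = 200, 5
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

def batch_grad(x, idx):
    """Mini-batch gradient of the least-squares finite sum."""
    Ai, bi = A[idx], b[idx]
    return Ai.T @ (Ai @ x - bi) / len(idx)

def double_adaptive_sgd(x, iters=400, b0=4, eta0=1.0, delta=1e-8):
    """Sketch of a 'double adaptive' loop: both the learning rate and
    the mini-batch size are adapted from observed gradient estimates."""
    batch, accum = b0, 0.0
    for _ in range(iters):
        idx = rng.choice(n, size=min(batch, n), replace=False)
        g = batch_grad(x, idx)
        gn = np.linalg.norm(g)
        # Adaptive step size: AdaGrad-norm style, decays with the
        # accumulated squared gradient norms.
        accum += gn ** 2
        eta = eta0 / (np.sqrt(accum) + delta)
        x = x - eta * g
        # Adaptive mini-batch size: grow the batch as the gradient
        # estimate shrinks, so the estimator's variance drops near
        # stationary points (a crude stand-in for the paper's
        # gradient-mapping-based rule).
        batch = int(min(n, max(b0, round(b0 / (gn + 1e-2)))))
    return x

x_hat = double_adaptive_sgd(np.zeros(d))
full_grad = np.linalg.norm(A.T @ (A @ x_hat - b) / n)
print(f"final full-gradient norm: {full_grad:.4f}")
```

The design intuition mirrors the abstract: early on, cheap small batches and large steps make fast progress; near a stationary point, the adapted step size and the enlarged batch jointly control the variance of the updates.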

Published

2025-04-11

How to Cite

Huang, F., & Luo, Y. (2025). Faster Double Adaptive Gradient Methods. Proceedings of the AAAI Conference on Artificial Intelligence, 39(25), 27018–27026. https://doi.org/10.1609/aaai.v39i25.34908

Section

AAAI Technical Track on Search and Optimization