Faster Double Adaptive Gradient Methods

Authors

  • Feihu Huang, Nanjing University of Aeronautics and Astronautics, MIIT Key Laboratory of Pattern Analysis and Machine Intelligence
  • Yuning Luo, Nanjing University of Aeronautics and Astronautics

DOI:

https://doi.org/10.1609/aaai.v39i25.34908

Abstract

In this paper, we propose a class of faster double adaptive gradient methods to solve nonconvex finite-sum optimization problems, possibly with nonsmooth regularization, by simultaneously using an adaptive learning rate and an adaptive mini-batch size. Specifically, we first propose a double adaptive stochastic gradient method (i.e., 2AdaSGD), and prove that our 2AdaSGD obtains a low stochastic first-order oracle (SFO) complexity for finding a stationary solution under the population smoothness condition. Furthermore, we propose a variance-reduced double adaptive stochastic gradient method (i.e., 2AdaSPIDER), and prove that our 2AdaSPIDER obtains an optimal SFO complexity under the average smoothness condition, which is lower than the SFO complexity of the existing double adaptive gradient algorithms. In particular, we introduce a new stochastic gradient mapping to adaptively adjust the mini-batch size in our stochastic gradient methods. We conduct numerical experiments to verify the efficiency of our proposed methods.
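To illustrate the "double adaptive" idea at a high level, the sketch below combines an adaptive (AdaGrad-norm-style) step size with a mini-batch size that grows as the estimated gradient norm shrinks. This is a generic toy illustration on a least-squares finite sum, not the paper's 2AdaSGD or 2AdaSPIDER; the function names, the specific step-size rule, and the batch-growth rule are all assumptions made for the example.

```python
import numpy as np

# Toy finite-sum problem: f(x) = (1/n) * sum_i 0.5 * (a_i^T x - b_i)^2.
# (Illustrative only; not the authors' algorithm or experimental setup.)
rng = np.random.default_rng(0)
n, d = 200, 5
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

def batch_grad(x, idx):
    """Mini-batch gradient of the least-squares finite sum."""
    Ai, bi = A[idx], b[idx]
    return Ai.T @ (Ai @ x - bi) / len(idx)

def double_adaptive_sgd(x, iters=400, b0=4, eta0=1.0, delta=1e-8):
    """Sketch of a 'double adaptive' loop: both the learning rate and
    the mini-batch size are adapted from observed gradient estimates."""
    batch, accum = b0, 0.0
    for _ in range(iters):
        idx = rng.choice(n, size=min(batch, n), replace=False)
        g = batch_grad(x, idx)
        gn = np.linalg.norm(g)
        # Adaptive step size: AdaGrad-norm style, decays with the
        # accumulated squared gradient norms.
        accum += gn ** 2
        eta = eta0 / (np.sqrt(accum) + delta)
        x = x - eta * g
        # Adaptive mini-batch size: grow the batch as the gradient
        # estimate shrinks, so the estimator's variance drops near
        # stationary points (a crude stand-in for the paper's
        # gradient-mapping-based rule).
        batch = int(min(n, max(b0, round(b0 / (gn + 1e-2)))))
    return x

x_hat = double_adaptive_sgd(np.zeros(d))
full_grad = np.linalg.norm(A.T @ (A @ x_hat - b) / n)
print(f"final full-gradient norm: {full_grad:.4f}")
```

The design intuition mirrors the abstract: early on, cheap small batches and large steps make fast progress; near a stationary point, the adapted step size and the enlarged batch jointly control the variance of the updates.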

Published

2025-04-11

How to Cite

Huang, F., & Luo, Y. (2025). Faster Double Adaptive Gradient Methods. Proceedings of the AAAI Conference on Artificial Intelligence, 39(25), 27018–27026. https://doi.org/10.1609/aaai.v39i25.34908

Section

AAAI Technical Track on Search and Optimization