Proximal Stochastic Recursive Momentum Methods for Nonconvex Composite Decentralized Optimization

Authors

  • Gabriel Mancino-Ball, Rensselaer Polytechnic Institute
  • Shengnan Miao, Rensselaer Polytechnic Institute
  • Yangyang Xu, Rensselaer Polytechnic Institute
  • Jie Chen, IBM Research

DOI:

https://doi.org/10.1609/aaai.v37i7.26087

Keywords:

ML: Optimization, ML: Distributed Machine Learning & Federated Learning

Abstract

Consider a network of N decentralized computing agents collaboratively solving a nonconvex stochastic composite problem. In this work, we propose a single-loop algorithm, called DEEPSTORM, that achieves optimal sample complexity for this setting. Unlike double-loop algorithms that must periodically compute a (stochastic) gradient with a large batch size, DEEPSTORM uses a small batch size, which is advantageous in settings such as streaming data and online learning. It is the first method to achieve optimal sample complexity for decentralized nonconvex stochastic composite problems while requiring only an O(1) batch size. We conduct convergence analysis for DEEPSTORM with both constant and diminishing step sizes. Additionally, under proper initialization and for a small enough desired solution error, we show that DEEPSTORM with a constant step size achieves a network-independent sample complexity, with an additional linear speed-up with respect to N over centralized methods. All code is made available at https://github.com/gmancino/DEEPSTORM.
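
The abstract names the main ingredients of DEEPSTORM: a single-loop update, an O(1) batch size, a stochastic recursive momentum (STORM-style) gradient estimator, and a proximal step for the nonsmooth part of the composite objective, all run over a decentralized network. The exact update rules are given in the paper; the NumPy sketch below is only a rough illustration of how such pieces typically fit together. The least-squares local loss, the l1 regularizer, the mixing matrix W, and all hyperparameters here are illustrative assumptions, not the authors' implementation from the paper or the linked repository.

# Illustrative sketch only: a generic proximal stochastic-recursive-momentum
# update with gradient tracking over a doubly stochastic mixing matrix W.
# Problem data, regularizer, and hyperparameters are assumptions, not the
# DEEPSTORM implementation from the paper or repository.
import numpy as np

def prox_l1(z, lam):
    """Proximal operator of lam * ||z||_1 (soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def stoch_grad(A_i, b_i, x, batch):
    """Mini-batch gradient of an assumed least-squares loss on agent i's data."""
    Ab = A_i[batch]
    return Ab.T @ (Ab @ x - b_i[batch]) / len(batch)

def deepstorm_like(A, b, W, T=500, eta=0.05, beta=0.1, lam=1e-3, batch_size=1, seed=0):
    """A, b: per-agent data; W: (N, N) doubly stochastic mixing matrix."""
    rng = np.random.default_rng(seed)
    N, d = len(A), A[0].shape[1]
    x = np.zeros((N, d))                     # local iterates
    v = np.zeros((N, d))                     # recursive-momentum gradient estimates
    for i in range(N):                       # initialize v with one small batch
        v[i] = stoch_grad(A[i], b[i], x[i], rng.integers(len(b[i]), size=batch_size))
    y = v.copy()                             # gradient-tracking variables
    for t in range(T):
        # communication: average iterates and trackers with neighbors
        x_mix, y_mix = W @ x, W @ y
        # local proximal step along the tracked gradient direction
        x_new = prox_l1(x_mix - eta * y_mix, eta * lam)
        v_new = np.empty_like(v)
        for i in range(N):
            batch = rng.integers(len(b[i]), size=batch_size)   # O(1) samples per step
            g_new = stoch_grad(A[i], b[i], x_new[i], batch)
            g_old = stoch_grad(A[i], b[i], x[i], batch)
            # STORM-style recursive momentum: variance reduction without large batches
            v_new[i] = g_new + (1.0 - beta) * (v[i] - g_old)
        # gradient tracking keeps y close to the network-average gradient estimate
        y = y_mix + v_new - v
        x, v = x_new, v_new
    return x

In this sketch, each agent draws only batch_size samples per iteration; the correction term (1 - beta) * (v_i - g_old) plays the role that periodic large-batch gradient computations play in double-loop variance-reduced methods.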

Published

2023-06-26

How to Cite

Mancino-Ball, G., Miao, S., Xu, Y., & Chen, J. (2023). Proximal Stochastic Recursive Momentum Methods for Nonconvex Composite Decentralized Optimization. Proceedings of the AAAI Conference on Artificial Intelligence, 37(7), 9055-9063. https://doi.org/10.1609/aaai.v37i7.26087

Issue

Vol. 37 No. 7 (2023)

Section

AAAI Technical Track on Machine Learning II