Proximal Stochastic Recursive Momentum Methods for Nonconvex Composite Decentralized Optimization


  • Gabriel Mancino-Ball Rensselaer Polytechnic Institute
  • Shengnan Miao Rensselaer Polytechnic Institute
  • Yangyang Xu Rensselaer Polytechnic Institute
  • Jie Chen IBM Research



ML: Optimization, ML: Distributed Machine Learning & Federated Learning


Consider a network of N decentralized computing agents collaboratively solving a nonconvex stochastic composite problem. In this work, we propose a single-loop algorithm, called DEEPSTORM, that achieves optimal sample complexity for this setting. Unlike double-loop algorithms that require a large batch size to periodically compute a (stochastic) gradient estimate, DEEPSTORM uses a small batch size, which is advantageous in settings such as streaming data and online learning. This is the first method to achieve optimal sample complexity for decentralized nonconvex stochastic composite problems while requiring only an O(1) batch size. We conduct convergence analysis for DEEPSTORM with both constant and diminishing step sizes. Additionally, under proper initialization and a sufficiently small desired solution error, we show that DEEPSTORM with a constant step size achieves a network-independent sample complexity, with an additional linear speed-up with respect to N over centralized methods. All codes are made available at
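
To illustrate the kind of update the abstract refers to, below is a minimal single-agent sketch of a proximal stochastic recursive momentum (STORM-style) step with an O(1) mini-batch, assuming a composite objective f(x) + r(x) with r(x) = lam * ||x||_1. It omits the decentralized communication (mixing) steps of DEEPSTORM, and the function names and parameters are illustrative rather than the authors' implementation.

```python
# Minimal sketch (not the authors' code) of a proximal stochastic recursive
# momentum update for a composite objective f(x) + lam * ||x||_1.
import numpy as np

def soft_threshold(z, tau):
    """Proximal operator of tau * ||.||_1 (soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def prox_storm_step(x, x_prev, v_prev, stoch_grad, beta, eta, lam, batch):
    """One recursive-momentum step using a single small mini-batch `batch`.

    stoch_grad(x, batch) is assumed to return a stochastic gradient of f at x
    evaluated on `batch`; beta is the momentum parameter, eta the step size.
    """
    # Recursive momentum estimator: correct the previous estimate with the
    # gradient difference evaluated on the *same* small batch.
    v = stoch_grad(x, batch) + (1.0 - beta) * (v_prev - stoch_grad(x_prev, batch))
    # Proximal gradient step on the composite objective.
    x_new = soft_threshold(x - eta * v, eta * lam)
    return x_new, v
```

In the decentralized setting, each agent would additionally average its iterate and gradient estimate with its neighbors via a mixing matrix at every step; the sketch above only shows the local recursive-momentum and proximal components.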




How to Cite

Mancino-Ball, G., Miao, S., Xu, Y., & Chen, J. (2023). Proximal Stochastic Recursive Momentum Methods for Nonconvex Composite Decentralized Optimization. Proceedings of the AAAI Conference on Artificial Intelligence, 37(7), 9055-9063.



AAAI Technical Track on Machine Learning II