Stability and Generalization of Decentralized Stochastic Gradient Descent

Authors

  • Tao Sun, College of Computer, National University of Defense Technology
  • Dongsheng Li, College of Computer, National University of Defense Technology
  • Bao Wang, Scientific Computing & Imaging Institute, University of Utah

DOI:

https://doi.org/10.1609/aaai.v35i11.17173

Keywords:

Learning Theory, Optimization, Stochastic Optimization

Abstract

The stability and generalization of stochastic gradient-based methods provide valuable insight into the algorithmic performance of machine learning models. As the main workhorse of deep learning, stochastic gradient descent (SGD) has been studied extensively. Nevertheless, the community has paid little attention to its decentralized variants. In this paper, we provide a novel formulation of decentralized stochastic gradient descent. Leveraging this formulation together with (non)convex optimization theory, we establish the first stability and generalization guarantees for decentralized stochastic gradient descent. Our theoretical results rest on a few common and mild assumptions and reveal, for the first time, that decentralization deteriorates the stability of SGD. We verify our theoretical findings on a variety of decentralized settings and benchmark machine learning models.

Published

2021-05-18

How to Cite

Sun, T., Li, D., & Wang, B. (2021). Stability and Generalization of Decentralized Stochastic Gradient Descent. Proceedings of the AAAI Conference on Artificial Intelligence, 35(11), 9756-9764. https://doi.org/10.1609/aaai.v35i11.17173

Section

AAAI Technical Track on Machine Learning IV