Stability-Based Generalization Analysis of the Asynchronous Decentralized SGD

Authors

  • Xiaoge Deng, National University of Defense Technology
  • Tao Sun, National University of Defense Technology
  • Shengwei Li, National University of Defense Technology
  • Dongsheng Li, National University of Defense Technology

DOI:

https://doi.org/10.1609/aaai.v37i6.25894

Keywords:

ML: Learning Theory, ML: Deep Learning Theory, ML: Distributed Machine Learning & Federated Learning, ML: Optimization

Abstract

The generalization ability often determines the success of machine learning algorithms in practice. It is therefore of great theoretical and practical importance to understand and bound the generalization error of machine learning algorithms. In this paper, we provide the first generalization results for the popular stochastic gradient descent (SGD) algorithm in the distributed asynchronous decentralized setting. Our analysis is based on the uniform stability tool, where an algorithm is stable if the learned model does not change much under small perturbations of the training set. Under some mild assumptions, we perform a comprehensive generalization analysis of asynchronous decentralized SGD, including generalization error and excess generalization error bounds for the strongly convex, convex, and non-convex cases. Our theoretical results reveal the effects of the learning rate, training data size, number of training iterations, decentralized communication topology, and asynchronous delay on the generalization performance of asynchronous decentralized SGD. We also study the optimization error with respect to the objective function values and investigate how the initial point affects the excess generalization error. Finally, we conduct extensive experiments on the MNIST, CIFAR-10, CIFAR-100, and Tiny-ImageNet datasets to validate the theoretical findings.
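As background for the abstract, the standard notion of uniform stability used in this line of work can be stated as follows (a textbook definition given for context; the paper's precise constants and variants may differ):

```latex
% An algorithm $A$ is $\epsilon$-uniformly stable if, for any two datasets
% $S, S'$ of size $n$ differing in at most one example,
\sup_{z} \; \mathbb{E}\big[\, \ell(A(S); z) - \ell(A(S'); z) \,\big] \le \epsilon .
% Uniform stability in turn bounds the expected generalization gap:
% $\mathbb{E}[F(A(S)) - F_S(A(S))] \le \epsilon$, where $F$ is the
% population risk and $F_S$ the empirical risk on $S$.
```

The algorithm under analysis can likewise be sketched. The toy NumPy implementation below is illustrative only; the ring topology, the fixed delay, and all names (`ring_mixing_matrix`, `async_decentralized_sgd`, etc.) are assumptions for exposition, not the authors' code. It shows the two ingredients the bounds depend on: a gossip step over a communication topology encoded by a mixing matrix, and a gradient computed at a stale, delayed copy of the parameters:

```python
import numpy as np

def ring_mixing_matrix(n):
    """Doubly stochastic mixing matrix for a ring topology:
    each worker averages with itself and its two neighbors."""
    W = np.zeros((n, n))
    for i in range(n):
        W[i, i] = W[i, (i - 1) % n] = W[i, (i + 1) % n] = 1.0 / 3.0
    return W

def async_decentralized_sgd(grad, x0, n_workers=8, n_iters=100,
                            lr=0.1, delay=2, rng=None):
    """Toy asynchronous decentralized SGD (illustrative sketch).

    grad(x, rng) returns a stochastic gradient at parameters x.
    Each worker gossip-averages with its ring neighbors, then applies
    a gradient computed at a `delay`-steps-old copy of its parameters.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    W = ring_mixing_matrix(n_workers)
    x = np.tile(x0, (n_workers, 1))   # one parameter row per worker
    history = [x.copy()]              # past iterates, for stale gradients
    for _ in range(n_iters):
        stale = history[max(0, len(history) - 1 - delay)]
        g = np.stack([grad(stale[i], rng) for i in range(n_workers)])
        x = W @ x - lr * g            # gossip-average, then descend
        history.append(x.copy())
    return x.mean(axis=0)             # consensus model

# Example: minibatch least squares on synthetic data
rng = np.random.default_rng(1)
A, b = rng.normal(size=(200, 5)), rng.normal(size=200)

def grad(x, rng):
    idx = rng.integers(len(b), size=32)
    Ai, bi = A[idx], b[idx]
    return 2 * Ai.T @ (Ai @ x - bi) / len(idx)

x_star = async_decentralized_sgd(grad, np.zeros(5))
```

In this sketch, the decentralized communication topology enters through the mixing matrix `W` and the asynchrony through the staleness parameter `delay`, mirroring the quantities whose effect on generalization the paper bounds.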

Published

2023-06-26

How to Cite

Deng, X., Sun, T., Li, S., & Li, D. (2023). Stability-Based Generalization Analysis of the Asynchronous Decentralized SGD. Proceedings of the AAAI Conference on Artificial Intelligence, 37(6), 7340-7348. https://doi.org/10.1609/aaai.v37i6.25894

Issue

Vol. 37 No. 6 (2023)

Section

AAAI Technical Track on Machine Learning I