Towards Stability and Generalization Bounds in Decentralized Minibatch Stochastic Gradient Descent
DOI: https://doi.org/10.1609/aaai.v38i14.29477
Keywords: ML: Other Foundations of Machine Learning, ML: Evaluation and Analysis
Abstract
Decentralized Stochastic Gradient Descent (D-SGD) is a communication-efficient approach for learning from large, distributed datasets. Inspired by parallel optimization, minibatching is incorporated to reduce gradient variance and thereby accelerate optimization. Nevertheless, to the best of our knowledge, the existing literature has not thoroughly explored the learning-theoretic foundations of Decentralized Minibatch Stochastic Gradient Descent (DM-SGD). In this paper, we address this theoretical gap by investigating the generalization properties of DM-SGD. We establish sharper generalization bounds for DM-SGD with and without replacement sampling, covering the convex and nonconvex as well as the smooth and nonsmooth cases. Moreover, our results consistently recover those of Centralized Stochastic Gradient Descent (C-SGD). In addition, we derive a generalization analysis for the Zero-Order (ZO) version of DM-SGD.
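The paper itself is not reproduced on this page, so the sketch below is only an illustration of the kind of update a DM-SGD scheme typically performs, not the authors' algorithm: each node takes a minibatch gradient step on its local data shard and then gossip-averages its iterate with its neighbors through a doubly stochastic mixing matrix. All names (ring_mixing_matrix, dm_sgd, zo_gradient), the ring topology, the least-squares loss, and the hyperparameters are assumptions made for the example.

```python
import numpy as np

def ring_mixing_matrix(n):
    """Doubly stochastic mixing matrix for a ring topology:
    each node averages itself with its two neighbors (weight 1/3 each)."""
    W = np.zeros((n, n))
    for i in range(n):
        W[i, i] = W[i, (i - 1) % n] = W[i, (i + 1) % n] = 1.0 / 3.0
    return W

def dm_sgd(data, labels, n_nodes=8, batch_size=16, lr=0.1, steps=200, seed=0):
    """Sketch of decentralized minibatch SGD with replacement sampling:
    local minibatch gradient step + gossip averaging at every round."""
    rng = np.random.default_rng(seed)
    d = data.shape[1]
    W = ring_mixing_matrix(n_nodes)
    # Split the dataset evenly across nodes (one shard per node).
    shards = np.array_split(rng.permutation(len(data)), n_nodes)
    X = np.zeros((n_nodes, d))  # one parameter vector per node

    for _ in range(steps):
        grads = np.zeros_like(X)
        for i, shard in enumerate(shards):
            # Minibatch sampled with replacement from the local shard;
            # replace=False gives the without-replacement variant.
            idx = rng.choice(shard, size=batch_size, replace=True)
            A, y = data[idx], labels[idx]
            # Least-squares loss as a stand-in for the generic objective.
            grads[i] = A.T @ (A @ X[i] - y) / batch_size
        # Gossip averaging combined with the local gradient step.
        X = W @ X - lr * grads
    return X.mean(axis=0)  # consensus estimate

def zo_gradient(f, x, mu=1e-4, rng=None):
    """Two-point zeroth-order gradient estimate (one common choice);
    a ZO variant would replace the analytic gradient above with this."""
    rng = rng or np.random.default_rng()
    u = rng.normal(size=x.shape)
    return (f(x + mu * u) - f(x)) / mu * u
```

For instance, on synthetic data `A @ w_true + noise`, `dm_sgd(A, y)` returns a consensus estimate close to `w_true`; swapping the analytic gradient for `zo_gradient` of the local loss gives a gradient-free variant in the spirit of the ZO analysis mentioned in the abstract.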
Published
2024-03-24
How to Cite
Wang, J., & Chen, H. (2024). Towards Stability and Generalization Bounds in Decentralized Minibatch Stochastic Gradient Descent. Proceedings of the AAAI Conference on Artificial Intelligence, 38(14), 15511-15519. https://doi.org/10.1609/aaai.v38i14.29477
Issue
Vol. 38 No. 14 (2024)
Section
AAAI Technical Track on Machine Learning V