Stability and Generalization of Zeroth-Order Decentralized Stochastic Gradient Descent with Changing Topology

Authors

  • Xiaolin Hu Renmin University of China
  • Zixuan Gong Renmin University of China
  • Gengze Xu Renmin University of China
  • Wei Liu XiaoMi AI Lab
  • Jian Luan XiaoMi AI Lab
  • Bin Wang XiaoMi AI Lab
  • Yong Liu Renmin University of China

DOI:

https://doi.org/10.1609/aaai.v39i16.33906

Abstract

Zeroth-order (ZO) optimization, a gradient-free method, has become a powerful tool when first-order gradients are unavailable or expensive to obtain, especially in decentralized learning scenarios where data and computational resources are distributed across multiple clients. Many efforts have analyzed the optimization convergence rate of zeroth-order decentralized stochastic gradient descent (ZO-DSGD) algorithms; however, the generalization of these methods has not been well studied. In this paper, we provide a generalization analysis of ZO-DSGD with changing topology, where clients run zeroth-order SGD on local data and communicate with each other according to a time-varying topology. We systematically analyze the generalization error in the convex, strongly convex, and non-convex cases. The results obtained in the convex and strongly convex cases with zeroth-order oracles recover those of SGD, and the generalization bounds derived in the non-convex case align with those of DSGD. To capture the influence of communication topology on generalization performance, we analyze local generalization bounds for the local models held at different clients. The obtained results reflect the influence of the number of clients, the local sample size, and the topology on the generalization error. To the best of our knowledge, this is the first work to provide a generalization analysis of zeroth-order decentralized stochastic gradient descent methods and to recover the results of SGD.
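The setting described above — each client estimating gradients with a zeroth-order oracle, taking a local SGD step, and averaging with neighbors under a time-varying topology — can be sketched as follows. This is an illustrative toy, not the paper's algorithm: the function names, the two-point gradient estimator, the alternating-ring topology, and all parameter values are our own choices for demonstration.

```python
import numpy as np

def zo_gradient(f, x, mu=1e-4, rng=None):
    """Two-point zeroth-order gradient estimate of f at x:
    g = (f(x + mu*u) - f(x - mu*u)) / (2*mu) * u, with u ~ N(0, I).
    Its expectation approaches the true gradient as mu -> 0."""
    rng = np.random.default_rng() if rng is None else rng
    u = rng.standard_normal(x.shape)
    return (f(x + mu * u) - f(x - mu * u)) / (2.0 * mu) * u

def ring_mixing_matrix(n, t):
    """Doubly stochastic mixing matrix on a ring whose active edges change
    with t: even steps average with the right neighbor, odd steps with the
    left -- a simple example of a time-varying (changing) topology."""
    offset = 1 if t % 2 == 0 else n - 1
    P = np.zeros((n, n))
    for i in range(n):
        P[i, (i + offset) % n] = 1.0
    return 0.5 * (np.eye(n) + P)

def zo_dsgd(local_losses, x0, steps=400, eta=0.05, mu=1e-4, seed=0):
    """ZO-DSGD sketch: each client takes a ZO gradient step on its local
    loss, then mixes its model with neighbors via the current topology."""
    rng = np.random.default_rng(seed)
    n = len(local_losses)
    X = np.tile(np.asarray(x0, dtype=float), (n, 1))  # one model row per client
    for t in range(steps):
        W = ring_mixing_matrix(n, t)
        G = np.stack([zo_gradient(f, X[i], mu, rng)
                      for i, f in enumerate(local_losses)])
        X = W @ X - eta * G  # gossip with neighbors, then descend
    return X

# Toy example: client i holds f_i(x) = 0.5 * ||x - c_i||^2, so the minimizer
# of the average loss is the mean of the centers c_i.
centers = np.array([[0.0, 1.0], [1.0, 0.0], [2.0, 1.0], [1.0, 2.0]])
losses = [lambda x, c=c: 0.5 * np.sum((x - c) ** 2) for c in centers]
X_final = zo_dsgd(losses, x0=np.zeros(2))
print(X_final.mean(axis=0))  # should lie near the centers' mean [1, 1]
```

Because the two-point estimator is unbiased only in the small-`mu` limit and has high variance, the iterates hover in a neighborhood of the consensus minimizer rather than converging exactly — which is precisely why stability-based generalization analysis of such methods must account for the zeroth-order oracle's noise.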

Published

2025-04-11

How to Cite

Hu, X., Gong, Z., Xu, G., Liu, W., Luan, J., Wang, B., & Liu, Y. (2025). Stability and Generalization of Zeroth-Order Decentralized Stochastic Gradient Descent with Changing Topology. Proceedings of the AAAI Conference on Artificial Intelligence, 39(16), 17342–17350. https://doi.org/10.1609/aaai.v39i16.33906

Section

AAAI Technical Track on Machine Learning II