FedASMU: Efficient Asynchronous Federated Learning with Dynamic Staleness-Aware Model Update

Authors

  • Ji Liu, Hithink RoyalFlush Information Network Co., Ltd., China
  • Juncheng Jia, Soochow University, China; Collaborative Innovation Center of Novel Software Technology and Industrialization, China
  • Tianshi Che, Auburn University, United States
  • Chao Huo, Soochow University, China
  • Jiaxiang Ren, Auburn University, United States
  • Yang Zhou, Auburn University, United States
  • Huaiyu Dai, North Carolina State University, United States
  • Dejing Dou, Boston Consulting Group, China

DOI:

https://doi.org/10.1609/aaai.v38i12.29297

Keywords:

ML: Distributed Machine Learning & Federated Learning, DMKM: Scalability, Parallel & Distributed Systems, ML: Scalability of ML Systems

Abstract

As a promising approach to dealing with distributed data, Federated Learning (FL) has achieved major advances in recent years. FL enables collaborative model training by exploiting the raw data dispersed across multiple edge devices. However, the data is generally non-independent and identically distributed (non-IID), i.e., statistical heterogeneity, and the edge devices differ significantly in both computation and communication capacity, i.e., system heterogeneity. Statistical heterogeneity leads to severe accuracy degradation, while system heterogeneity significantly prolongs the training process. To address these heterogeneity issues, we propose an Asynchronous Staleness-aware Model Update FL framework, i.e., FedASMU, with two novel methods. First, we propose an asynchronous FL system model with a dynamic model aggregation method between updated local models and the global model on the server for superior accuracy and high efficiency. Second, we propose an adaptive local model adjustment method that aggregates the fresh global model with local models on devices to further improve accuracy. Extensive experimentation with 6 models and 5 public datasets demonstrates that FedASMU significantly outperforms baseline approaches in terms of accuracy (0.60% to 23.90% higher) and efficiency (3.54% to 97.98% faster).
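To make the idea of staleness-aware asynchronous aggregation concrete, the following is a minimal, generic sketch and not the FedASMU algorithm itself. The abstract does not specify how the aggregation weights are computed, so the polynomial discount, the function names (`staleness_weight`, `server_aggregate`, `device_refresh`), and all default parameters are assumptions chosen purely for illustration. Models are represented as flat lists of floats.

```python
from typing import List


def staleness_weight(staleness: int, base_weight: float = 0.5, decay: float = 0.5) -> float:
    """Hypothetical polynomial discount: staler updates get smaller mixing weights.

    This is an illustrative choice, not the weighting rule used in the paper.
    """
    if staleness < 0:
        raise ValueError("staleness must be non-negative")
    return base_weight * (1.0 + staleness) ** (-decay)


def server_aggregate(
    global_model: List[float],
    local_model: List[float],
    global_version: int,
    local_base_version: int,
) -> List[float]:
    """Server side: mix one uploaded local model into the global model.

    Staleness is the number of global versions produced since the device
    downloaded the model it trained on.
    """
    w = staleness_weight(global_version - local_base_version)
    return [(1.0 - w) * g + w * l for g, l in zip(global_model, local_model)]


def device_refresh(
    local_model: List[float],
    fresh_global_model: List[float],
    mix: float = 0.5,
) -> List[float]:
    """Device side: blend a freshly fetched global model into the local model.

    The fixed `mix` coefficient is an assumption; an adaptive scheme would
    choose it per device and per round.
    """
    return [(1.0 - mix) * l + mix * g for l, g in zip(local_model, fresh_global_model)]
```

In this sketch, an update trained on the current global version is mixed in with weight 0.5, while one that is three versions behind is mixed in with weight 0.25, so slow devices still contribute without overwriting more recent progress.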

Published

2024-03-24

How to Cite

Liu, J., Jia, J., Che, T., Huo, C., Ren, J., Zhou, Y., Dai, H., & Dou, D. (2024). FedASMU: Efficient Asynchronous Federated Learning with Dynamic Staleness-Aware Model Update. Proceedings of the AAAI Conference on Artificial Intelligence, 38(12), 13900-13908. https://doi.org/10.1609/aaai.v38i12.29297

Issue

Section

AAAI Technical Track on Machine Learning III