FedDAT: An Approach for Foundation Model Finetuning in Multi-Modal Heterogeneous Federated Learning

Authors

  • Haokun Chen, LMU Munich; Siemens AG
  • Yao Zhang, LMU Munich; Munich Center for Machine Learning (MCML)
  • Denis Krompass, Siemens AG
  • Jindong Gu, University of Oxford
  • Volker Tresp, LMU Munich; Munich Center for Machine Learning (MCML)

DOI:

https://doi.org/10.1609/aaai.v38i10.29007

Keywords:

ML: Multimodal Learning, CV: Language and Vision, ML: Distributed Machine Learning & Federated Learning

Abstract

Recently, foundation models have exhibited remarkable advancements in multi-modal learning. These models, equipped with millions (or billions) of parameters, typically require a substantial amount of data for finetuning. However, collecting and centralizing training data from diverse sectors is challenging due to distinct privacy regulations. Federated Learning (FL) emerges as a promising solution, enabling multiple clients to collaboratively train neural networks without centralizing their local data. To alleviate client computation burdens and communication overheads, previous works have adapted Parameter-efficient Finetuning (PEFT) methods for FL, where only a small fraction of the model parameters is optimized and communicated during the federated rounds. Nevertheless, most previous works have focused on a single modality and neglected a phenomenon common in practice: data heterogeneity across clients. Therefore, in this work, we propose a finetuning framework tailored to heterogeneous multi-modal FL, called Federated Dual-Adapter Teacher (FedDAT). Specifically, our approach leverages a Dual-Adapter Teacher (DAT) to address data heterogeneity by regularizing the client local updates, and applies Mutual Knowledge Distillation (MKD) for efficient knowledge transfer. FedDAT is the first approach that enables efficient distributed finetuning of foundation models for a variety of heterogeneous Vision-Language tasks. To demonstrate its effectiveness, we conduct extensive experiments on four multi-modal FL benchmarks with different types of data heterogeneity, where FedDAT substantially outperforms existing centralized PEFT methods adapted for FL.
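The abstract describes Mutual Knowledge Distillation only at a high level. As a rough, non-authoritative sketch of what a bidirectional distillation objective between the two adapter branches could look like, the PyTorch snippet below combines a task loss with a KL term in each direction; the function name mkd_loss, the temperature T, and the weighting alpha are illustrative assumptions, not the paper's exact formulation.

import torch
import torch.nn.functional as F

def mkd_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Illustrative sketch of a mutual (bidirectional) distillation objective:
    # each branch is supervised by the labels and additionally distills from
    # the other branch's detached, temperature-softened logits.
    ce_s = F.cross_entropy(student_logits, labels)
    ce_t = F.cross_entropy(teacher_logits, labels)
    kl_s = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                    F.softmax(teacher_logits.detach() / T, dim=-1),
                    reduction="batchmean") * (T * T)
    kl_t = F.kl_div(F.log_softmax(teacher_logits / T, dim=-1),
                    F.softmax(student_logits.detach() / T, dim=-1),
                    reduction="batchmean") * (T * T)
    return (ce_s + alpha * kl_s) + (ce_t + alpha * kl_t)

# Toy usage with random logits for a 10-way task:
s = torch.randn(8, 10)
t = torch.randn(8, 10)
y = torch.randint(0, 10, (8,))
print(mkd_loss(s, t, y).item())

Detaching the counterpart's logits in each KL term is a common design choice so that each branch treats the other as a fixed soft-target provider within a single update step; whether FedDAT does exactly this is not stated in the abstract.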

Published

2024-03-24

How to Cite

Chen, H., Zhang, Y., Krompass, D., Gu, J., & Tresp, V. (2024). FedDAT: An Approach for Foundation Model Finetuning in Multi-Modal Heterogeneous Federated Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 38(10), 11285-11293. https://doi.org/10.1609/aaai.v38i10.29007

Issue

Vol. 38 No. 10 (2024)

Section

AAAI Technical Track on Machine Learning I