Federated Adaptive Prompt Tuning for Multi-Domain Collaborative Learning
DOI:
https://doi.org/10.1609/aaai.v38i13.29434
Keywords:
ML: Distributed Machine Learning & Federated Learning, CV: Applications
Abstract
Federated learning (FL) enables multiple clients to collaboratively train a global model without disclosing their data. Previous research often requires training all model parameters. However, the emergence of powerful pre-trained models makes it possible to achieve higher performance with fewer learnable parameters in FL. In this paper, we propose a federated adaptive prompt tuning algorithm, FedAPT, for multi-domain collaborative image classification with powerful foundation models such as CLIP. Compared with direct federated prompt tuning, our core idea is to adaptively unlock specific domain knowledge for each test sample so that each sample receives a personalized prompt. To implement this idea, we design an adaptive prompt tuning module, which consists of a meta prompt, an adaptive network, and a set of keys. The server randomly generates the keys and assigns a unique key to each client. Then all clients cooperatively train the global adaptive network and meta prompt with their local datasets and the frozen keys. Ultimately, the aggregated global model can assign a personalized prompt to CLIP based on the domain features of each test sample. We perform extensive experiments on two multi-domain image classification datasets across two settings: supervised and unsupervised. The results show that FedAPT can achieve better performance with fewer than 10% of the parameters of the fully trained model, and that the global model can perform well in diverse client domains simultaneously.
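The server-side steps described in the abstract (generating one random key per client, then aggregating the clients' shared parameters) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `generate_keys` and `weighted_average`, the Gaussian key initialization, the flat-list parameter layout, and the sample-count-weighted (FedAvg-style) aggregation rule are all assumptions, since the abstract does not specify them.

```python
import random


def generate_keys(num_clients, key_dim, seed=0):
    """Hypothetical server step: draw one random key vector per client.

    The keys are never trained afterwards (they stay frozen); Gaussian
    initialization is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    return {
        client_id: [rng.gauss(0.0, 1.0) for _ in range(key_dim)]
        for client_id in range(num_clients)
    }


def weighted_average(client_params, client_sizes):
    """Hypothetical aggregation: average flat parameter lists, weighting
    each client by its number of local samples (FedAvg-style, assumed)."""
    total = sum(client_sizes)
    dim = len(client_params[0])
    return [
        sum(params[i] * size for params, size in zip(client_params, client_sizes)) / total
        for i in range(dim)
    ]


if __name__ == "__main__":
    keys = generate_keys(num_clients=3, key_dim=4)
    print(sorted(keys))  # client ids that received a key

    # Two clients report updated shared parameters (e.g. a flattened meta
    # prompt); the second client holds three times as much data.
    merged = weighted_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
    print(merged)
```

In this sketch only the small shared parameters (standing in for the meta prompt and adaptive network) are exchanged and averaged, which reflects the abstract's point that far fewer parameters are trained than in full-model federated learning.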
Published
2024-03-24
How to Cite
Su, S., Yang, M., Li, B., & Xue, X. (2024). Federated Adaptive Prompt Tuning for Multi-Domain Collaborative Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 38(13), 15117-15125. https://doi.org/10.1609/aaai.v38i13.29434
Section
AAAI Technical Track on Machine Learning IV