Generating Synthetic Data for Unsupervised Federated Learning of Cross-Modal Retrieval

Authors

  • Tianlong Zhang, Beijing University of Posts and Telecommunications
  • Zhe Xue, Beijing University of Posts and Telecommunications
  • Adnan Mahmood, Macquarie University
  • Junping Du, Beijing University of Posts and Telecommunications
  • Yuchen Dong, Beijing University of Posts and Telecommunications
  • Shilong Ou, Beijing University of Posts and Telecommunications
  • Lang Feng, Beijing University of Posts and Telecommunications
  • Ming-Hsuan Yang, University of California at Merced
  • Yuankai Qi, Macquarie University

DOI:

https://doi.org/10.1609/aaai.v39i21.34415

Abstract

Unsupervised federated learning for cross-modal retrieval has received increasing attention in recent years because it removes the need for annotations and avoids uploading clients’ original data to servers. Most existing methods focus on learning better local models and aggregating them to overcome data distribution drift across clients. Unlike prior works, we propose to address the data distribution problem by generating synthetic data, which can benefit existing federated learning methods. Specifically, we train a WGAN generator on each client with three newly designed loss constraints to improve the quality of the generated data. We first compute cluster prototypes to compensate for the lack of labels. Then, a direct contrastive loss between generated image and text features, an indirect contrastive loss with reference to the cluster prototypes, and a Jensen-Shannon Divergence (JSD) loss, also with reference to the cluster prototypes, work together to constrain the WGAN. The locally trained generators and local prototypes are sent to the server, which generates and filters synthetic data while taking the data distribution across all clients into account. The filtered data are used to train the aggregated global retrieval model, which is then sent back to the clients. After several rounds of client-server iteration, the final global model becomes robust to all clients. Extensive experiments using four baselines across three datasets demonstrate that our method performs favourably against state-of-the-art methods.
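
As an illustration only, the three prototype-related constraints named in the abstract can be sketched in NumPy. The abstract gives no formulas, so everything below is an assumption rather than the authors' formulation: k-means stands in for the clustering step, a symmetric InfoNCE loss stands in for the direct contrastive loss, a cross-entropy toward the prototype nearest the paired feature stands in for the indirect contrastive loss, and the JSD is taken between softmax similarity distributions over the prototypes. All function names and the temperature `tau` are hypothetical, and the WGAN objective itself is omitted.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def log_softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for stability
    return z - np.log(np.exp(z).sum(axis=1, keepdims=True))

def kmeans_prototypes(feats, k, iters=20, seed=0):
    """Assumed clustering step: plain k-means; the centers act as prototypes."""
    rng = np.random.default_rng(seed)
    centers = feats[rng.choice(len(feats), size=k, replace=False)].copy()
    for _ in range(iters):
        dists = ((feats[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        assign = dists.argmin(1)
        for j in range(k):
            members = feats[assign == j]
            if len(members):  # keep the old center if a cluster is empty
                centers[j] = members.mean(0)
    return centers

def direct_contrastive(img, txt, tau=0.1):
    """Assumed form: symmetric InfoNCE, with row i of img paired to row i of txt."""
    logits = l2_normalize(img) @ l2_normalize(txt).T / tau
    i2t = -np.diag(log_softmax(logits)).mean()
    t2i = -np.diag(log_softmax(logits.T)).mean()
    return 0.5 * (i2t + t2i)

def prototype_distribution(feats, protos, tau=0.1):
    """Softmax over cosine similarities between features and prototypes."""
    return np.exp(log_softmax(l2_normalize(feats) @ l2_normalize(protos).T / tau))

def indirect_contrastive(img, txt, protos, tau=0.1, eps=1e-12):
    """Assumed form: pull each feature toward the prototype nearest its paired feature."""
    p_img = prototype_distribution(img, protos, tau)
    p_txt = prototype_distribution(txt, protos, tau)
    rows = np.arange(len(img))
    loss_img = -np.log(p_img[rows, p_txt.argmax(1)] + eps).mean()
    loss_txt = -np.log(p_txt[rows, p_img.argmax(1)] + eps).mean()
    return 0.5 * (loss_img + loss_txt)

def jsd(p, q, eps=1e-12):
    """Mean Jensen-Shannon divergence between row-wise distributions p and q."""
    m = 0.5 * (p + q)
    def kl(a, b):
        return (a * (np.log(a + eps) - np.log(b + eps))).sum(1)
    return (0.5 * kl(p, m) + 0.5 * kl(q, m)).mean()

# Toy usage with random stand-ins for generated image and text features.
rng = np.random.default_rng(0)
img = rng.normal(size=(32, 16))
txt = img + 0.01 * rng.normal(size=(32, 16))
protos = kmeans_prototypes(np.vstack([img, txt]), k=4)
p_img = prototype_distribution(img, protos)
p_txt = prototype_distribution(txt, protos)
regularizer = (direct_contrastive(img, txt)
               + indirect_contrastive(img, txt, protos)
               + jsd(p_img, p_txt))
```

In a setup like this, `regularizer` would be added to the generator's adversarial objective with weighting coefficients; the equal weights used here are a placeholder, not values from the paper.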

Published

2025-04-11

How to Cite

Zhang, T., Xue, Z., Mahmood, A., Du, J., Dong, Y., Ou, S., … Qi, Y. (2025). Generating Synthetic Data for Unsupervised Federated Learning of Cross-Modal Retrieval. Proceedings of the AAAI Conference on Artificial Intelligence, 39(21), 22569–22577. https://doi.org/10.1609/aaai.v39i21.34415

Section

AAAI Technical Track on Machine Learning VII