Generating Synthetic Data for Unsupervised Federated Learning of Cross-Modal Retrieval

Tianlong Zhang; Zhe Xue; Adnan Mahmood; Junping Du; Yuchen Dong; Shilong Ou; Lang Feng; Ming-Hsuan Yang; Yuankai Qi

doi:10.1609/aaai.v39i21.34415

Authors

Tianlong Zhang, Beijing University of Posts and Telecommunications
Zhe Xue, Beijing University of Posts and Telecommunications
Adnan Mahmood, Macquarie University
Junping Du, Beijing University of Posts and Telecommunications
Yuchen Dong, Beijing University of Posts and Telecommunications
Shilong Ou, Beijing University of Posts and Telecommunications
Lang Feng, Beijing University of Posts and Telecommunications
Ming-Hsuan Yang, University of California at Merced
Yuankai Qi, Macquarie University

DOI: https://doi.org/10.1609/aaai.v39i21.34415

Abstract

Unsupervised federated learning for cross-modal retrieval has received increasing attention in recent years because it removes the need for annotations and avoids uploading clients’ original data to servers. Most existing methods focus on learning better local models and aggregating them to overcome data distribution drift across clients. Unlike prior work, we propose to address the data distribution problem by generating synthetic data, which can benefit existing federated learning methods. Specifically, on each client we train a WGAN generator with three newly designed loss constraints to improve the quality of the generated data. We first compute cluster prototypes to compensate for the lack of labels. Then three losses jointly constrain the WGAN: a direct contrastive loss between generated image and text features, an indirect contrastive loss with reference to the cluster prototypes, and a Jensen-Shannon Divergence (JSD) loss, also with reference to the cluster prototypes. The locally trained generators and local prototypes are sent to the server, which generates and filters synthetic data while taking the data distribution across all clients into account. The filtered data are used to train the aggregated global retrieval model, which is then sent back to the clients. After several rounds of client-server iteration, the final global model becomes robust across all clients. Extensive experiments using four baselines across three datasets demonstrate that our method performs favourably against state-of-the-art methods.
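The abstract names the three generator constraints without giving their formulas. The NumPy sketch below shows one plausible reading of each term, not the authors' formulation. It assumes a symmetric InfoNCE form for the direct contrastive loss, a cross-entropy against the prototype nearest to the paired feature for the indirect loss, and a JSD between the two modalities' softmax distributions over prototypes. The function names, the temperature `tau`, and the equal weights in `combined_constraint` are all hypothetical.

```python
import numpy as np

EPS = 1e-12  # numerical guard for logs and divisions


def l2_normalize(x):
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + EPS)


def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def direct_contrastive_loss(img, txt, tau=0.1):
    """Assumed symmetric InfoNCE: row i of img and row i of txt are a positive pair."""
    logits = l2_normalize(img) @ l2_normalize(txt).T / tau
    idx = np.arange(len(img))
    i2t = -np.log(softmax(logits)[idx, idx] + EPS).mean()
    t2i = -np.log(softmax(logits.T)[idx, idx] + EPS).mean()
    return float(0.5 * (i2t + t2i))


def prototype_distribution(feats, prototypes, tau=0.1):
    """Softmax over cosine similarities between features and cluster prototypes."""
    return softmax(l2_normalize(feats) @ l2_normalize(prototypes).T / tau)


def indirect_contrastive_loss(img, txt, prototypes, tau=0.1):
    """Assumed form: each feature is pulled toward the prototype nearest to its pair."""
    p_img = prototype_distribution(img, prototypes, tau)
    p_txt = prototype_distribution(txt, prototypes, tau)
    idx = np.arange(len(img))
    y_img, y_txt = p_txt.argmax(axis=1), p_img.argmax(axis=1)  # pseudo-labels
    loss_img = -np.log(p_img[idx, y_img] + EPS).mean()
    loss_txt = -np.log(p_txt[idx, y_txt] + EPS).mean()
    return float(0.5 * (loss_img + loss_txt))


def jsd(p, q):
    """Mean Jensen-Shannon divergence between row-wise distributions p and q."""
    m = 0.5 * (p + q)
    kl_pm = np.sum(p * np.log((p + EPS) / (m + EPS)), axis=1)
    kl_qm = np.sum(q * np.log((q + EPS) / (m + EPS)), axis=1)
    return float(np.mean(0.5 * (kl_pm + kl_qm)))


def combined_constraint(img, txt, prototypes, weights=(1.0, 1.0, 1.0), tau=0.1):
    """Weighted sum of the three terms; the weights are placeholders, not from the paper."""
    p_img = prototype_distribution(img, prototypes, tau)
    p_txt = prototype_distribution(txt, prototypes, tau)
    terms = (
        direct_contrastive_loss(img, txt, tau),
        indirect_contrastive_loss(img, txt, prototypes, tau),
        jsd(p_img, p_txt),
    )
    return float(sum(w * t for w, t in zip(weights, terms)))
```

In a training loop such a term would be added to the usual WGAN generator objective and computed with an autodiff framework rather than NumPy. The prototypes would come from clustering the client's real features, for example with k-means, which is also an assumption here.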
