Federated Latent Dirichlet Allocation: A Local Differential Privacy Based Framework

Yansheng Wang; Yongxin Tong; Dingyuan Shi

doi:10.1609/aaai.v34i04.6096

Authors

Yansheng Wang, Beihang University
Yongxin Tong, Beihang University
Dingyuan Shi, Beihang University

Abstract

Latent Dirichlet Allocation (LDA) is a widely adopted topic model for industrial-grade text mining applications. However, its performance relies heavily on collecting large amounts of text data from users' everyday lives for model training. Such data collection risks severe privacy leakage if the data collector is untrustworthy. To protect text data privacy while still allowing accurate model training, we investigate federated learning of LDA models: the model is trained collaboratively by an untrustworthy data collector and multiple users, and each user's raw text data are stored locally and never uploaded to the data collector. To this end, we propose FedLDA, a local differential privacy (LDP) based framework for federated learning of LDA models. Central to FedLDA is a novel LDP mechanism called Random Response with Priori (RRP), which provides theoretical guarantees on both data privacy and model accuracy. We also design techniques to reduce the communication cost between the data collector and the users during model training. Extensive experiments on three open datasets verify the effectiveness of our solution.
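
The abstract does not spell out how RRP works, so the sketch below is not the authors' mechanism. It only illustrates the general idea of randomized response under LDP, using the standard k-ary (generalized) randomized response: a user perturbs a categorical value locally before reporting it. The function names, the interpretation of the value as a topic index, and the parameters `domain_size` and `epsilon` are assumptions made for illustration.

```python
import math
import random


def keep_probability(domain_size: int, epsilon: float) -> float:
    """Probability of reporting the true value under k-ary randomized response."""
    if domain_size < 2:
        raise ValueError("domain_size must be at least 2")
    e_eps = math.exp(epsilon)
    return e_eps / (e_eps + domain_size - 1)


def k_randomized_response(true_value: int, domain_size: int,
                          epsilon: float, rng=random) -> int:
    """Perturb a categorical value in {0, ..., domain_size - 1} on the user's side.

    Generic k-ary randomized response (NOT the paper's RRP): report the true
    value with probability e^eps / (e^eps + k - 1); otherwise report one of
    the other k - 1 values uniformly at random.
    """
    if not 0 <= true_value < domain_size:
        raise ValueError("true_value is outside the domain")
    if rng.random() < keep_probability(domain_size, epsilon):
        return true_value
    # Draw uniformly from the k - 1 values that differ from true_value.
    other = rng.randrange(domain_size - 1)
    return other if other < true_value else other + 1


if __name__ == "__main__":
    # Hypothetical usage: a user holds topic index 3 out of 10 topics.
    report = k_randomized_response(true_value=3, domain_size=10, epsilon=1.0)
    print("reported value:", report)
```

For any two inputs, the probability of producing a given report differs by a factor of at most e^epsilon, which is the epsilon-LDP condition. A smaller epsilon yields more noise and stronger privacy, at the cost of accuracy for whatever the collector estimates from the reports.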
