Is Your Data Relevant?: Dynamic Selection of Relevant Data for Federated Learning

Authors

  • Lokesh Nagalapatti, IIT Bombay
  • Ruhi Sharma Mittal, IBM Research AI
  • Ramasuri Narayanam, Adobe Research India

DOI:

https://doi.org/10.1609/aaai.v36i7.20755

Keywords:

Machine Learning (ML), Data Mining & Knowledge Management (DMKM)

Abstract

Federated Learning (FL) is a machine learning paradigm in which multiple clients collaborate to learn a global machine learning model at a central server. It is plausible that not all the data owned by each client is relevant to the server's learning objective, and updates derived from irrelevant data could be detrimental to the global model. The task of selecting relevant data has been explored in traditional machine learning settings, where the assumption is that all the data is available in one place. In FL settings, the data is distributed across multiple clients and the server cannot inspect it, which precludes applying traditional relevant-data selection methods. In this paper, we propose an approach called Federated Learning with Relevant Data (FLRD), which enables clients to derive updates using only relevant data. Each client learns a model called the Relevant Data Selector (RDS), private to that client, to perform the selection. This in turn helps in building an effective global model. We perform experiments with multiple real-world datasets to demonstrate the efficacy of our solution. The results show (a) the capability of FLRD to identify relevant data samples locally at each client and (b) the superiority of the global model learned by FLRD over baseline algorithms.
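
To make the setting concrete, here is a minimal sketch of federated averaging in which each client computes its update only from samples it has flagged as relevant. It is an illustration under stated assumptions, not the FLRD algorithm: the abstract does not say how the RDS is trained, so the keep masks below are hard-coded stand-ins for a private selector's output, and the 1-D model, the data, and all function names (`local_update`, `fed_avg`, `train`) are hypothetical.

```python
# Hypothetical sketch: FedAvg on a 1-D model y ~ w * x, with a per-client keep mask.
# The masks are hard-coded stand-ins for a private selector; this is not the paper's RDS.

def local_update(w, data, keep, lr=0.1, epochs=5):
    """Run a few gradient steps on the samples flagged by `keep`; return (weight, sample count)."""
    selected = [pair for pair, flag in zip(data, keep) if flag]
    for _ in range(epochs):
        if not selected:
            break
        # Gradient of the mean squared error (w*x - y)^2 with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in selected) / len(selected)
        w -= lr * grad
    return w, len(selected)


def fed_avg(w_global, updates):
    """Average client weights, weighted by how many samples each client used."""
    total = sum(n for _, n in updates)
    if total == 0:
        return w_global  # no client contributed any sample this round
    return sum(w * n for w, n in updates) / total


def train(clients, masks, rounds=20):
    """Server loop: broadcast w, collect local updates, aggregate."""
    w = 0.0
    for _ in range(rounds):
        updates = [local_update(w, data, keep) for data, keep in zip(clients, masks)]
        w = fed_avg(w, updates)
    return w


xs = [0.2, 0.4, 0.6, 0.8, 1.0]
relevant = [(x, 2.0 * x) for x in xs]      # samples matching the server's target, y = 2x
irrelevant = [(x, -3.0 * x) for x in xs]   # samples from an unrelated relation, y = -3x
clients = [relevant + irrelevant, relevant + irrelevant]

use_all = [[True] * 10 for _ in clients]
use_relevant = [[True] * 5 + [False] * 5 for _ in clients]

w_all = train(clients, use_all)            # trained on everything
w_sel = train(clients, use_relevant)       # trained on the flagged samples only
print(w_all, w_sel)
```

In this toy setup the unfiltered run is pulled away from the target slope of 2 by the irrelevant samples, while the masked run approaches it. The server only ever sees weights and sample counts, never the data or the masks, which mirrors the constraint the abstract describes.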

Published

2022-06-28

How to Cite

Nagalapatti, L., Mittal, R. S., & Narayanam, R. (2022). Is Your Data Relevant?: Dynamic Selection of Relevant Data for Federated Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 36(7), 7859-7867. https://doi.org/10.1609/aaai.v36i7.20755

Section

AAAI Technical Track on Machine Learning II