SpreadGNN: Decentralized Multi-Task Federated Learning for Graph Neural Networks on Molecular Data


  • Chaoyang He University of Southern California
  • Emir Ceyani University of Southern California
  • Keshav Balasubramanian University of Southern California
  • Murali Annavaram University of Southern California
  • Salman Avestimehr University of Southern California




Machine Learning (ML)


Graph Neural Networks (GNNs) are the first-choice methods for graph machine learning problems thanks to their ability to learn state-of-the-art representations from graph-structured data. However, centralizing a massive amount of real-world graph data for GNN training is prohibitive due to user-side privacy concerns, regulatory restrictions, and commercial competition. Federated Learning is the de facto standard for collaborative training of machine learning models over many distributed edge devices without the need for centralization. Nevertheless, training graph neural networks in a federated setting remains vaguely defined and poses both statistical and systems challenges. This work proposes SpreadGNN, a novel multi-task federated training framework that is, to the best of our knowledge, the first in the literature capable of operating in the presence of partial labels and in the absence of a central server. We provide convergence guarantees and empirically demonstrate the efficacy of our framework on a variety of non-I.I.D. distributed graph-level molecular property prediction datasets with partial labels. Our results show that SpreadGNN outperforms GNN models trained over a central-server-dependent federated learning system, even in constrained topologies.
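To illustrate the serverless setting the abstract describes, the sketch below shows generic decentralized periodic model averaging over a ring communication topology: each client takes a local update step and then averages parameters only with its topology neighbors, so no central server is needed. This is a conceptual toy with hypothetical names and a scalar parameter per client, not the paper's actual SpreadGNN implementation.

```python
import numpy as np

def ring_mixing_matrix(n):
    """Doubly stochastic mixing matrix for a ring topology:
    each client averages equally with itself and its two neighbors."""
    W = np.zeros((n, n))
    for i in range(n):
        W[i, i] = 1.0 / 3.0
        W[i, (i - 1) % n] = 1.0 / 3.0
        W[i, (i + 1) % n] = 1.0 / 3.0
    return W

def decentralized_round(params, grads, W, lr=0.1):
    """One communication round: a local gradient step per client,
    followed by parameter exchange with topology neighbors only
    (no central aggregation server)."""
    local = params - lr * grads   # local SGD step on each client
    return W @ local              # neighbor-only averaging

# Toy demo: 4 clients holding different parameter values; with zero
# gradients, repeated neighbor averaging drives all clients toward
# consensus at the initial mean (1.5 here).
W = ring_mixing_matrix(4)
params = np.array([0.0, 1.0, 2.0, 3.0])
for _ in range(50):
    params = decentralized_round(params, np.zeros(4), W, lr=0.0)
print(params)  # all entries close to 1.5
```

Because the ring is connected and the mixing matrix is doubly stochastic, repeated mixing contracts disagreement between clients, which is the intuition behind the convergence guarantees for serverless training mentioned in the abstract.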




How to Cite

He, C., Ceyani, E., Balasubramanian, K., Annavaram, M., & Avestimehr, S. (2022). SpreadGNN: Decentralized Multi-Task Federated Learning for Graph Neural Networks on Molecular Data. Proceedings of the AAAI Conference on Artificial Intelligence, 36(6), 6865-6873. https://doi.org/10.1609/aaai.v36i6.20643



AAAI Technical Track on Machine Learning I