Federated Causally Invariant Feature Learning

Authors

  • Xianjie Guo — Hefei University of Technology, China; Key Laboratory of Knowledge Engineering with Big Data of Ministry of Education, China
  • Kui Yu — Hefei University of Technology, China; Key Laboratory of Knowledge Engineering with Big Data of Ministry of Education, China
  • Lizhen Cui — Shandong University, China
  • Han Yu — Nanyang Technological University, Singapore
  • Xiaoxiao Li — Nanyang Technological University, Singapore; The University of British Columbia, Canada; Vector Institute, Canada

DOI:

https://doi.org/10.1609/aaai.v39i16.33866

Abstract

Federated feature selection (FFS) is a promising field for selecting informative features while preserving data privacy in federated learning (FL) settings. Because existing FFS methods focus on capturing correlations between features and labels, they struggle to achieve satisfactory performance under data-distribution heterogeneity among FL clients, and they cannot address the out-of-distribution (OOD) problem that arises when a significant portion of clients do not actively participate in FL training. To address these limitations, we propose Federated Causally Invariant Feature Learning (FedCIFL), a novel approach for learning causally invariant features in a privacy-preserving manner. We design a sample reweighting strategy to eliminate spurious correlations introduced by selection bias and iteratively estimate the federated causal effect between each feature and the labels (with the remaining features initially treated as confounders). By iteratively refining the confounding feature set to identify the true confounders, FedCIFL mitigates the impact of limited local data on the accuracy of federated causal effect estimation. Theoretical analysis proves the correctness of FedCIFL under reasonable assumptions. Extensive experiments on synthetic and real-world datasets demonstrate the superiority of FedCIFL over eight state-of-the-art baselines, improving on the best-performing approach by 3.19%, 9.07% and 2.65% in terms of average test Accuracy, RMSE and F1 score, respectively. It is a first-of-its-kind FFS approach capable of handling Non-IID and OOD data simultaneously. The source code is available at https://github.com/Xianjie-Guo/FedCIFL.
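To make the abstract's reweighting step concrete: the idea of estimating a feature's causal effect on the label after down-weighting spurious correlations with the remaining (confounding) features can be illustrated with a generic inverse-propensity reweighting scheme. The sketch below is a hypothetical single-client stand-in, not FedCIFL's actual algorithm — FedCIFL estimates effects federatedly and refines the confounder set iteratively; here `ipw_effect`, its parameters, and the logistic-regression propensity model are all illustrative choices.

```python
import numpy as np

def ipw_effect(t, y, X, n_iter=2000, lr=0.1):
    """Estimate the causal effect of a binary feature t on label y,
    adjusting for candidate confounders X via sample reweighting.
    Hypothetical sketch: propensities P(t=1|X) are fit with a plain
    logistic regression trained by gradient descent, then samples are
    reweighted by inverse propensity so that the confounders are
    (approximately) decorrelated from t before comparing group means."""
    Xb = np.hstack([X, np.ones((len(t), 1))])      # add intercept column
    beta = np.zeros(Xb.shape[1])
    for _ in range(n_iter):                        # logistic regression by GD
        p = 1.0 / (1.0 + np.exp(-Xb @ beta))
        beta -= lr * Xb.T @ (p - t) / len(t)
    p = np.clip(1.0 / (1.0 + np.exp(-Xb @ beta)), 0.01, 0.99)
    w = np.where(t == 1, 1.0 / p, 1.0 / (1.0 - p))  # inverse-propensity weights
    g1, g0 = t == 1, t == 0
    # weighted mean difference of y between the two treatment groups
    return (w[g1] @ y[g1]) / w[g1].sum() - (w[g0] @ y[g0]) / w[g0].sum()
```

On data where a confounder drives both the feature and the label, the naive mean difference is inflated, while the reweighted estimate recovers a value close to the true effect — the same failure mode of correlation-based FFS that the abstract describes.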

Published

2025-04-11

How to Cite

Guo, X., Yu, K., Cui, L., Yu, H., & Li, X. (2025). Federated Causally Invariant Feature Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 39(16), 16978–16986. https://doi.org/10.1609/aaai.v39i16.33866

Section

AAAI Technical Track on Machine Learning II