Good Gradients Poison Your Model: Evading Defenses in Federated Learning via Boundary-adaptive Perturbation
DOI:
https://doi.org/10.1609/aaai.v40i16.38328
Abstract
Federated learning (FL) allows for collaborative model training while preserving data privacy, but its distributed nature makes it vulnerable to poisoning attacks. Existing defense methods typically rely on gradients from multiple clients to define a trusted region, selecting only the trustworthy updates (good gradients) within this region for aggregation. Mainstream defense boundaries are categorized as hard boundaries, soft boundaries, and semi-soft boundaries. However, we argue that even good gradients within these boundaries can still be exploited by attackers to poison the model. To tackle this challenge, we introduce a boundary-adaptive attack method that leverages the directional properties of optimization techniques to derive baseline poisoned gradients. Through iterative perturbation, it generates seemingly innocent gradients that subtly deviate from the global model. Our extensive study on benchmark datasets and mainstream defensive mechanisms confirms that the proposed attack poses a significant threat to the integrity and security of FL practices, despite the proliferation of robust FL methods.
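The core idea, as the abstract describes it, is to iteratively perturb a malicious update until it falls inside the defense's trust region. A minimal sketch of that idea is below, assuming a simple hard boundary (a norm ball around the benign mean); the function name, the shrinkage loop, and the boundary model are all illustrative assumptions, not the paper's actual algorithm.

```python
import numpy as np

def boundary_adaptive_gradient(benign_grads, malicious_dir, shrink=0.9, max_iter=100):
    """Hypothetical sketch: shrink a malicious perturbation until the
    poisoned gradient lies inside a hard trust boundary (a norm ball
    around the benign mean), so the defense accepts it as a 'good gradient'.
    This is an illustration of the general idea, not the paper's method."""
    mean = benign_grads.mean(axis=0)
    # Assumed hard boundary: max distance of any benign gradient from the mean.
    radius = max(np.linalg.norm(g - mean) for g in benign_grads)
    scale = 1.0
    for _ in range(max_iter):
        candidate = mean + scale * malicious_dir
        if np.linalg.norm(candidate - mean) <= radius:
            return candidate  # within the boundary, yet biased toward the attack direction
        scale *= shrink  # iteratively shrink the perturbation
    return mean  # fall back to the benign mean if no feasible scale is found

rng = np.random.default_rng(0)
benign = rng.normal(size=(10, 5))
# Attack direction: oppose the benign consensus (a common poisoning heuristic).
poisoned = boundary_adaptive_gradient(benign, -benign.mean(axis=0) * 10.0)
```

The shrink loop stands in for the paper's iterative perturbation: each iteration trades attack magnitude for plausibility until the update passes the boundary check.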
Published
2026-03-14
How to Cite
Zhao, X., Shi, J., Li, Y., Huang, J., & Fan, C. (2026). Good Gradients Poison Your Model: Evading Defenses in Federated Learning via Boundary-adaptive Perturbation. Proceedings of the AAAI Conference on Artificial Intelligence, 40(16), 13262–13270. https://doi.org/10.1609/aaai.v40i16.38328
Issue
Section
AAAI Technical Track on Computer Vision XIII