Good Gradients Poison Your Model: Evading Defenses in Federated Learning via Boundary-adaptive Perturbation

Authors

  • Xiaojie Zhao Beijing University of Posts and Telecommunications
  • Jinqiao Shi Beijing University of Posts and Telecommunications
  • Yi Li Beijing University of Posts and Telecommunications
  • Junmin Huang Beijing University of Posts and Telecommunications
  • Chongru Fan Beijing University of Posts and Telecommunications

DOI:

https://doi.org/10.1609/aaai.v40i16.38328

Abstract

Federated learning (FL) enables collaborative model training while preserving data privacy, but its distributed nature makes it vulnerable to poisoning attacks. Existing defense methods typically use gradients from multiple clients to define a trusted region, selecting only the trustworthy updates (good gradients) within this region for aggregation. Mainstream defense boundaries fall into three categories: hard boundaries, soft boundaries, and semi-soft boundaries. However, we argue that even good gradients within these boundaries can still be exploited by attackers to poison the model. To tackle this challenge, we introduce a boundary-adaptive attack method that leverages the directional properties of optimization techniques to derive baseline poisoned gradients. Through iterative perturbation, it generates seemingly innocent gradients that subtly deviate from the global model. Extensive experiments on benchmark datasets against mainstream defense mechanisms confirm that the proposed attack poses a significant threat to the integrity and security of FL, despite the proliferation of robust FL methods.
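As a rough illustration of the idea described in the abstract (not the authors' actual construction, which is given in the paper), the sketch below shows one way a boundary-adaptive perturbation could be crafted in NumPy. The defense boundary is modeled as a hypothetical L2 ball around the mean of the benign gradients, and all function names, parameters, and thresholds here are assumptions for illustration only.

```python
import numpy as np

def boundary_adaptive_poison(benign_grads, radius, steps=20, shrink=0.8):
    """Craft a poisoned gradient that stays inside a hypothetical hard
    defense boundary: an L2 ball of the given radius around the mean of
    the benign gradients observed by the attacker.

    benign_grads: array of shape (n_clients, dim), benign client gradients.
    radius:       assumed acceptance radius enforced by the defense.
    """
    mean_grad = benign_grads.mean(axis=0)

    # Baseline poisoned direction: push opposite to the benign consensus.
    direction = -mean_grad / (np.linalg.norm(mean_grad) + 1e-12)

    # Start with an aggressive perturbation and iteratively scale it back
    # until the crafted gradient falls inside the trusted region.
    scale = 2.0 * radius
    poisoned = mean_grad + scale * direction
    for _ in range(steps):
        poisoned = mean_grad + scale * direction
        if np.linalg.norm(poisoned - mean_grad) <= radius:
            break
        scale *= shrink  # shrink the perturbation and try again
    return poisoned

# Toy usage: 10 simulated benign gradients of dimension 100.
rng = np.random.default_rng(0)
benign = rng.normal(size=(10, 100))
crafted = boundary_adaptive_poison(benign, radius=1.0)
print(np.linalg.norm(crafted - benign.mean(axis=0)))  # within the assumed radius
```

The sketch captures only the high-level mechanism the abstract describes: start from a baseline poisoned direction and iteratively temper the perturbation until the result would be accepted by a boundary-based defense.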

Published

2026-03-14

How to Cite

Zhao, X., Shi, J., Li, Y., Huang, J., & Fan, C. (2026). Good Gradients Poison Your Model: Evading Defenses in Federated Learning via Boundary-adaptive Perturbation. Proceedings of the AAAI Conference on Artificial Intelligence, 40(16), 13262–13270. https://doi.org/10.1609/aaai.v40i16.38328

Section

AAAI Technical Track on Computer Vision XIII