Stop Diverse OOD Attacks: Knowledge Ensemble for Reliable Defense

Authors

  • Zhenbo Shi: University of Science and Technology of China, School of Computer Science and Technology, Hefei, China; University of Science and Technology of China, Suzhou Institute for Advanced Research, Suzhou, China; Laboratory for Advanced Computing and Intelligence Engineering, Wuxi, China; University of Science and Technology of China, Hefei National Laboratory, Hefei, China
  • Xiaoman Liu: University of Science and Technology of China, School of Computer Science and Technology, Hefei, China; University of Science and Technology of China, Suzhou Institute for Advanced Research, Suzhou, China
  • Yuxuan Zhang: University of Science and Technology of China, School of Computer Science and Technology, Hefei, China
  • Shuchang Wang: University of Science and Technology of China, School of Computer Science and Technology, Hefei, China
  • Rui Shu: University of Science and Technology of China, School of Computer Science and Technology, Hefei, China
  • Zhidong Yu: University of Science and Technology of China, School of Computer Science and Technology, Hefei, China; University of Science and Technology of China, Hefei National Laboratory, Hefei, China
  • Wei Yang: University of Science and Technology of China, School of Computer Science and Technology, Hefei, China; University of Science and Technology of China, Suzhou Institute for Advanced Research, Suzhou, China; University of Science and Technology of China, Hefei National Laboratory, Hefei, China
  • Liusheng Huang: University of Science and Technology of China, School of Computer Science and Technology, Hefei, China; University of Science and Technology of China, Suzhou Institute for Advanced Research, Suzhou, China

DOI

https://doi.org/10.1609/aaai.v39i19.34251

Abstract

Enhancing defense through model ensembles is an emerging trend, where the challenge lies in how to use ensemble knowledge to counter Out-of-Distribution (OOD) attacks. In this paper, we propose the Reliable Defense Ensemble (REE) to address this issue. REE optimizes the ensemble knowledge of models through aggregation and enhances multidimensional robust performance through collaboration. It employs Dynamic Synergy Amplification for weight allocation and strategy adjustment. Furthermore, we design a new Kernel Anomaly Smoothing Detection Module, which detects anomalous attacks using a smoothing feature function based on Gaussian kernel mean embedding and a multi-layer feedback structure. In particular, we build a framework that uses reinforcement learning to iteratively fine-tune the parameters of inter-model communication and consensus. Extensive experimental results show that REE outperforms current state-of-the-art methods by a large margin in defending against OOD attacks.
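
As a rough illustration of the Gaussian kernel mean embedding idea named in the abstract, the sketch below scores a query point by evaluating the empirical kernel mean embedding of a reference sample at that point and flags low scores as anomalous. This is a generic textbook construction, not the paper's Kernel Anomaly Smoothing Detection Module: the function names, the bandwidth, the threshold, and the toy data are all assumptions made for illustration, and the multi-layer feedback structure is not modeled.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_embedding_score(x, reference, bandwidth=1.0):
    """Evaluate the empirical kernel mean embedding of `reference` at x.

    This is the average kernel similarity between x and the reference
    samples; values near 0 mean x lies far from the reference data.
    """
    total = sum(gaussian_kernel(x, r, bandwidth) for r in reference)
    return total / len(reference)


def is_anomalous(x, reference, threshold, bandwidth=1.0):
    """Flag x when its score falls below a (hypothetical) threshold."""
    return mean_embedding_score(x, reference, bandwidth) < threshold


# Toy in-distribution reference features (assumed values, not paper data).
reference = [(0.0, 0.0), (0.1, -0.1), (-0.2, 0.1), (0.05, 0.15)]
in_dist = (0.0, 0.05)
far_away = (5.0, 5.0)

print(is_anomalous(in_dist, reference, threshold=0.5))
print(is_anomalous(far_away, reference, threshold=0.5))
```

In this toy setup the point near the reference cluster receives a score close to 1 and the distant point a score close to 0, so only the latter is flagged. In practice the bandwidth and threshold would need to be chosen from held-out data.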

Published

2025-04-11

How to Cite

Shi, Z., Liu, X., Zhang, Y., Wang, S., Shu, R., Yu, Z., … Huang, L. (2025). Stop Diverse OOD Attacks: Knowledge Ensemble for Reliable Defense. Proceedings of the AAAI Conference on Artificial Intelligence, 39(19), 20436–20444. https://doi.org/10.1609/aaai.v39i19.34251

Section

AAAI Technical Track on Machine Learning V