Certified but Fooled! Breaking Certified Defenses with Ghost Certificates

Authors

  • Quoc Viet Vo University of Adelaide
  • Tashreque Mohammed Haq University of Adelaide
  • Paul Montague Defence Science and Technology Group
  • Tamas Abraham Defence Science and Technology Group
  • Ehsan Abbasnejad Monash University
  • Damith C. Ranasinghe University of Adelaide

DOI:

https://doi.org/10.1609/aaai.v40i12.37924

Abstract

Certified defenses promise provable robustness guarantees. We study the malicious exploitation of probabilistic certification frameworks to better understand the limits of guarantee provisions. Now, the objective is not only to mislead a classifier, but also to manipulate the certification process into generating a robustness guarantee for an adversarial input—certificate spoofing. A recent ICLR study demonstrated that crafting large perturbations can shift inputs far into regions capable of generating a certificate for an incorrect class. Our study investigates whether perturbations that cause a misclassification, and yet coax a certified model into issuing a deceptive, large robustness radius for a target class, can still be made small and imperceptible. We explore the idea of region-focused adversarial examples to craft imperceptible perturbations, spoof certificates, and achieve certification radii larger than those of the source class—ghost certificates. Extensive evaluations on ImageNet demonstrate the ability to effectively bypass state-of-the-art certified defenses such as DensePure. Our work underscores the need to better understand the limits of robustness certification methods.
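For context, the probabilistic certification frameworks referenced in the abstract are typically built on randomized smoothing, where a classifier's prediction under Gaussian noise yields a certified radius proportional to the inverse normal CDF of the top class's probability. The sketch below illustrates that certification loop on a toy one-dimensional classifier; it is background illustration only, not the paper's attack or DensePure itself, and `toy_classifier`, `sigma`, and the sample count are illustrative assumptions (a real certifier would also use a confidence lower bound on the estimated probability).

```python
import random
from statistics import NormalDist


def toy_classifier(x):
    # Hypothetical stand-in for a base classifier: class 0 for
    # negative inputs, class 1 otherwise.
    return 0 if x < 0 else 1


def certify(x, sigma=0.5, n=1000, seed=0):
    """Monte-Carlo certification in the style of randomized smoothing:
    sample Gaussian noise, take the majority vote, and issue a radius
    R = sigma * Phi^{-1}(p_hat) for the top class's empirical probability.
    Returns (predicted_class, certified_radius); radius 0.0 means abstain.
    """
    rng = random.Random(seed)
    counts = {0: 0, 1: 0}
    for _ in range(n):
        counts[toy_classifier(x + rng.gauss(0.0, sigma))] += 1
    top = max(counts, key=counts.get)
    # Clamp away from 1.0 so the inverse CDF stays finite.
    p_hat = min(counts[top] / n, 1.0 - 1.0 / n)
    if p_hat <= 0.5:
        return top, 0.0  # no certificate
    return top, sigma * NormalDist().inv_cdf(p_hat)
```

An input far from the decision boundary earns a larger certified radius (e.g. `certify(-2.0)` yields a bigger radius than `certify(-0.2)`); a spoofed "ghost" certificate, as studied in the paper, would make the certifier report a large radius for the wrong class without that genuine margin.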

Published

2026-03-14

How to Cite

Vo, Q. V., Haq, T. M., Montague, P., Abraham, T., Abbasnejad, E., & Ranasinghe, D. C. (2026). Certified but Fooled! Breaking Certified Defenses with Ghost Certificates. Proceedings of the AAAI Conference on Artificial Intelligence, 40(12), 9621–9629. https://doi.org/10.1609/aaai.v40i12.37924

Section

AAAI Technical Track on Computer Vision IX