Improving Adversarial Robustness to Sensitivity and Invariance Attacks with Deep Metric Learning (Student Abstract)

Authors

  • Anaelia Ovalle, University of California, Los Angeles
  • Evan Czyzycki, University of California, Los Angeles
  • Cho-Jui Hsieh, University of California, Los Angeles

DOI:

https://doi.org/10.1609/aaai.v37i13.27006

Keywords:

Adversarial Robustness, AI Safety, Metric Learning

Abstract

Intentionally crafted adversarial samples have proven effective at exploiting weaknesses in deep neural networks. A standard approach to adversarial robustness assumes a defense framework against samples crafted by minimally perturbing an input such that the corresponding model output changes. These sensitivity attacks exploit the model's sensitivity to task-irrelevant features. Adversarial samples can also be crafted via invariance attacks, which exploit the model's underestimation of task-relevant features. Prior work has indicated a tradeoff when defending against both attack types within a strictly L-p bounded defense. To promote robustness to both types of attack beyond Euclidean distance metrics, we use metric learning to frame adversarial regularization as an optimal transport problem. Our preliminary results indicate that regularizing over invariant perturbations in our framework improves defense against both invariance and sensitivity attacks.
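For a concrete sense of the sensitivity attacks the abstract describes, the sketch below crafts a minimally perturbed input with the Fast Gradient Sign Method (FGSM) in PyTorch. This is a generic single-step illustration under an L-infinity budget, not the authors' method; model, x, y, and the budget eps are hypothetical placeholders.

    import torch
    import torch.nn.functional as F

    def fgsm_sensitivity_attack(model, x, y, eps=0.03):
        # Sensitivity attack sketch: take one gradient-sign step that
        # minimally perturbs x (within an L-infinity ball of radius eps)
        # in the direction that increases the classification loss,
        # aiming to flip the model's output. Illustrative only.
        x_adv = x.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        loss.backward()
        x_adv = x_adv + eps * x_adv.grad.sign()
        # Keep the perturbed input in the valid image range [0, 1].
        return x_adv.clamp(0.0, 1.0).detach()

Invariance attacks are the complementary case: they make large changes along task-relevant directions that the model wrongly treats as unimportant, so the output stays fixed while the true label changes. The paper's contribution, per the abstract, is to regularize against both failure modes with a learned metric rather than a fixed Euclidean one.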

Published

2023-09-06

How to Cite

Ovalle, A., Czyzycki, E., & Hsieh, C.-J. (2023). Improving Adversarial Robustness to Sensitivity and Invariance Attacks with Deep Metric Learning (Student Abstract). Proceedings of the AAAI Conference on Artificial Intelligence, 37(13), 16292-16293. https://doi.org/10.1609/aaai.v37i13.27006