Using Random Perturbations to Mitigate Adversarial Attacks on NLP Models

Authors

  • Abigail Swenor, University of Colorado Colorado Springs

DOI:

https://doi.org/10.1609/aaai.v36i11.21707

Keywords:

Natural Language Processing, Adversarial Attacks, Sentiment Analysis

Abstract

Deep learning models have excelled at many Natural Language Processing tasks, but they remain vulnerable to adversarial attacks. We offer a defense against these attacks based on random perturbations, such as spelling correction, synonym substitution, or word dropping, applied to random words in random sentences of the input. Our defense methods successfully return attacked models to their original accuracy, within statistical significance.
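
The abstract describes the defense only at a high level; the following Python sketch illustrates the general idea under stated assumptions. Every name here (the `perturb` function, the lookup-table spelling corrector, the synonym map, and the default of three perturbations) is hypothetical and for illustration only, not the paper's actual implementation, which also randomizes over sentences.

```python
import random

# Hypothetical sketch of a random-perturbation defense: before classifying a
# (possibly attacked) input, randomly drop, synonym-swap, or spell-correct a
# few randomly chosen words. Assumed helper data, not from the paper.

def drop_word(words, idx):
    """Option 1: drop the selected word entirely."""
    return words[:idx] + words[idx + 1:]

def substitute_synonym(words, idx, synonyms):
    """Option 2: swap the selected word for a random known synonym."""
    options = synonyms.get(words[idx].lower())
    if options:
        return words[:idx] + [random.choice(options)] + words[idx + 1:]
    return words

def correct_spelling(words, idx, corrections):
    """Option 3: stand-in spelling correction via a lookup table."""
    fixed = corrections.get(words[idx].lower(), words[idx])
    return words[:idx] + [fixed] + words[idx + 1:]

def perturb(text, synonyms, corrections, n_perturbations=3):
    """Apply random perturbations to random words of the input text."""
    words = text.split()
    for _ in range(n_perturbations):
        if not words:
            break
        idx = random.randrange(len(words))
        op = random.choice(["drop", "synonym", "spell"])
        if op == "drop":
            words = drop_word(words, idx)
        elif op == "synonym":
            words = substitute_synonym(words, idx, synonyms)
        else:
            words = correct_spelling(words, idx, corrections)
    return " ".join(words)

# Example: scrub an adversarially misspelled sentiment input before it
# reaches the classifier.
synonyms = {"film": ["movie", "picture"], "great": ["excellent", "superb"]}
corrections = {"grwat": "great", "fiml": "film"}
print(perturb("the fiml was grwat and moving", synonyms, corrections))
```

The intuition behind such a sketch is that character- and word-level adversarial edits are brittle: random perturbations are likely to disrupt the attacker's carefully chosen modifications while leaving the overall sentiment signal intact.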

Published

2022-06-28

How to Cite

Swenor, A. (2022). Using Random Perturbations to Mitigate Adversarial Attacks on NLP Models. Proceedings of the AAAI Conference on Artificial Intelligence, 36(11), 13142-13143. https://doi.org/10.1609/aaai.v36i11.21707