Bigram and Unigram Based Text Attack via Adaptive Monotonic Heuristic Search

Xinghao Yang; Weifeng Liu; James Bailey; Dacheng Tao; Wei Liu

doi:10.1609/aaai.v35i1.16151

Authors

Xinghao Yang, University of Technology Sydney
Weifeng Liu, China University of Petroleum (East China)
James Bailey, The University of Melbourne
Dacheng Tao, The University of Sydney
Wei Liu, University of Technology Sydney

DOI:

https://doi.org/10.1609/aaai.v35i1.16151

Keywords:

Security, Adversarial Learning & Robustness, Adversarial Attacks & Robustness

Abstract

Deep neural networks (DNNs) are known to be vulnerable to adversarial images, while their robustness in text classification is rarely studied. Several lines of text attack methods have been proposed in the literature, such as character-level, word-level, and sentence-level attacks. However, it is still a challenge to minimize the number of word distortions necessary to induce misclassification while simultaneously ensuring lexical correctness, syntactic correctness, and semantic similarity. In this paper, we propose the Bigram and Unigram based Monotonic Heuristic Search (BU-MHS) method to examine the vulnerability of deep models. Our method has three major merits. Firstly, we propose to attack text documents not only at the unigram word level but also at the bigram level to avoid producing meaningless outputs. Secondly, we propose a hybrid method to replace the input words with both their synonyms and sememe candidates, which greatly enriches potential substitutions compared to only using synonyms. Lastly, we design a search algorithm, i.e., Monotonic Heuristic Search (MHS), to determine the priority of word replacements, aiming to reduce the modification cost in an adversarial attack. We evaluate the effectiveness of BU-MHS on the IMDB, AG's News, and Yahoo! Answers text datasets by attacking four state-of-the-art DNN models. Experimental results show that our BU-MHS achieves the highest attack success rate while changing the smallest number of words compared with other existing models.
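To make the idea of a word-substitution attack with unigram and bigram candidates more concrete, the following is a minimal sketch under stated assumptions. It is not the paper's MHS algorithm: it uses a plain greedy search that, at each step, applies the single substitution that most reduces the classifier's margin. The scorer `toy_margin`, the weight table `POSITIVE`, and the candidate table `CANDIDATES` (standing in for synonym and sememe resources) are all hypothetical and exist only for illustration.

```python
from typing import Callable, Dict, Iterator, List, Tuple

# Hypothetical toy model: the text counts as "positive" while the margin is > 0.
POSITIVE: Dict[str, float] = {"great": 2.0, "good": 1.0, "very": 0.5, "fine": 0.5}

# Hypothetical substitution table keyed by unigram or bigram.
# In a real attack these candidates would come from synonym/sememe lookups.
CANDIDATES: Dict[Tuple[str, ...], List[List[str]]] = {
    ("great",): [["fine"], ["decent"]],
    ("good",): [["okay"]],
    ("very", "good"): [["passable"]],
}


def toy_margin(tokens: List[str]) -> float:
    """Margin of the original (positive) label; <= 0 means the label flipped."""
    return sum(POSITIVE.get(t, 0.0) for t in tokens)


def substitutions(tokens: List[str]) -> Iterator[Tuple[int, int, List[str]]]:
    """Yield (start, span_length, replacement), trying bigrams before unigrams."""
    for n in (2, 1):
        for i in range(len(tokens) - n + 1):
            for repl in CANDIDATES.get(tuple(tokens[i:i + n]), []):
                yield i, n, repl


def greedy_attack(
    tokens: List[str],
    margin_fn: Callable[[List[str]], float],
    max_changes: int = 3,
) -> Tuple[List[str], int, bool]:
    """Greedily apply the substitution that lowers the margin the most."""
    current = list(tokens)
    changes = 0
    while margin_fn(current) > 0 and changes < max_changes:
        best, best_margin = None, margin_fn(current)
        for i, n, repl in substitutions(current):
            trial = current[:i] + repl + current[i + n:]
            m = margin_fn(trial)
            if m < best_margin:  # keep only strict improvements
                best, best_margin = trial, m
        if best is None:  # no substitution helps; give up
            break
        current = best
        changes += 1
    return current, changes, margin_fn(current) <= 0


if __name__ == "__main__":
    print(greedy_attack(["a", "great", "and", "good", "film"], toy_margin))
    print(greedy_attack(["very", "good", "film"], toy_margin))
```

In the second call, replacing the bigram "very good" as one unit flips the toy label in a single step, whereas the unigram substitution alone would not. This is only meant to illustrate why bigram-level candidates can lower the modification cost; the search order and candidate sources in BU-MHS are described in the paper itself.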
