SSPAttack: A Simple and Sweet Paradigm for Black-Box Hard-Label Textual Adversarial Attack

Authors

  • Han Liu, Dalian University of Technology
  • Zhi Xu, Dalian University of Technology
  • Xiaotong Zhang, Dalian University of Technology
  • Xiaoming Xu, Dalian University of Technology
  • Feng Zhang, Peking University
  • Fenglong Ma, The Pennsylvania State University
  • Hongyang Chen, Zhejiang Lab
  • Hong Yu, Dalian University of Technology
  • Xianchao Zhang, Dalian University of Technology

DOI:

https://doi.org/10.1609/aaai.v37i11.26553

Keywords:

SNLP: Adversarial Attacks & Robustness

Abstract

Hard-label textual adversarial attack is a challenging task, as only the predicted label is available and the text space is discrete and non-differentiable. Research in this area is still in its infancy, and only a handful of methods have been proposed. However, existing methods suffer from either the high complexity of genetic algorithms or inaccurate gradient estimation, and thus struggle to obtain adversarial examples with high semantic similarity and a low perturbation rate under a tight-budget scenario. In this paper, we propose a simple and sweet paradigm for hard-label textual adversarial attack, named SSPAttack. Specifically, SSPAttack first uses initialization to generate an adversarial example and removes unnecessary replacement words to reduce the number of changed words. Then it determines the replacement order and searches for an anchor synonym, thus avoiding a pass over all the synonyms. Finally, it pushes substitution words towards the original words until an appropriate adversarial example is obtained. The core idea of SSPAttack is simply swapping words, so its mechanism is simple. Experimental results on eight benchmark datasets and two real-world APIs show that SSPAttack achieves strong performance in terms of semantic similarity, perturbation rate and query efficiency.
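To make the first two steps of the abstract concrete, the following is a minimal toy sketch in Python. It is not the authors' implementation. The function names (initialize, restore_unnecessary, toy_predict), the keyword classifier, and the tiny synonym table are all hypothetical and chosen only for illustration. The initialization here is a deterministic left-to-right substitution so the result is reproducible; the paper's actual initialization strategy, replacement ordering, anchor-synonym search, and final push towards the original words are not reproduced here.

```python
from typing import Callable, Dict, List, Optional

Tokens = List[str]
Classifier = Callable[[Tokens], int]  # hard-label oracle: returns only a class id


def initialize(original: Tokens, synonyms: Dict[str, List[str]],
               predict: Classifier, true_label: int) -> Optional[Tokens]:
    """Substitute words left to right until the predicted label flips.

    Assumption: a deterministic stand-in for the initialization step.
    Returns None if no adversarial example is found.
    """
    current = list(original)
    for i, word in enumerate(original):
        candidates = synonyms.get(word, [])
        if not candidates:
            continue
        current[i] = candidates[0]
        if predict(current) != true_label:
            return current
    return None


def restore_unnecessary(original: Tokens, adversarial: Tokens,
                        predict: Classifier, true_label: int) -> Tokens:
    """Revert each substituted word if the text stays adversarial without it."""
    current = list(adversarial)
    for i, word in enumerate(original):
        if current[i] == word:
            continue  # this position was never changed
        trial = list(current)
        trial[i] = word
        if predict(trial) != true_label:
            current = trial  # the substitution was unnecessary
    return current


def toy_predict(tokens: Tokens) -> int:
    """Hypothetical victim model: label 1 if a positive keyword is present."""
    return 1 if "good" in tokens or "great" in tokens else 0


original = ["the", "movie", "is", "good"]
synonyms = {"movie": ["film"], "good": ["decent", "fine"]}  # hypothetical table
true_label = toy_predict(original)

start = initialize(original, synonyms, toy_predict, true_label)
final = restore_unnecessary(original, start, toy_predict, true_label)
changed = sum(o != f for o, f in zip(original, final))
print(start, final, changed)
```

In this toy run the initialization changes two words ("movie" and "good"), and the restoration step reverts "movie" because the label stays flipped without that substitution, leaving a single changed word. Each call to predict corresponds to one query against the victim model, which is the cost that query efficiency measures.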

Published

2023-06-26

How to Cite

Liu, H., Xu, Z., Zhang, X., Xu, X., Zhang, F., Ma, F., Chen, H., Yu, H., & Zhang, X. (2023). SSPAttack: A Simple and Sweet Paradigm for Black-Box Hard-Label Textual Adversarial Attack. Proceedings of the AAAI Conference on Artificial Intelligence, 37(11), 13228-13235. https://doi.org/10.1609/aaai.v37i11.26553

Section

AAAI Technical Track on Speech & Natural Language Processing