Modulation-Based Backdoors: Leveraging Amplitude and Frequency Patterns to Attack Speaker Recognition

Hanbo Cai; Pengcheng Zhang; Yan Xiao; De Li; Hanting Chu; Ying Luo

doi:10.1609/aaai.v40i1.36961

Authors

Hanbo Cai College of Computer Science and Software Engineering, Hohai University, Nanjing, Jiangsu, China; College of Artificial Intelligence, Suzhou Vocational Institute of Industrial Technology, Suzhou, Jiangsu, China
Pengcheng Zhang College of Computer Science and Software Engineering, Hohai University, Nanjing, Jiangsu, China
Yan Xiao School of Cyber Science and Technology, Shenzhen Campus of Sun Yat-sen University, Shenzhen, China
De Li Computer Science and Engineering, Guangxi Normal University, Guilin, China
Hanting Chu School of Mathematics and Computer Science, Zhejiang Agriculture and Forestry University, Hangzhou, Zhejiang, China
Ying Luo College of Artificial Intelligence, Suzhou Vocational Institute of Industrial Technology, Suzhou, Jiangsu, China

Abstract

Deep neural networks (DNNs) are widely and successfully applied in the field of speaker recognition. However, recent studies reveal that these models are vulnerable to backdoor attacks, where adversaries inject malicious behaviors into victim models by poisoning the training process. Existing attack methods often rely on environmental noise or complex voice transformations, which are typically difficult to implement and exhibit poor stealthiness. To address these issues, this paper proposes two modulation-based backdoor attacks that leverage frequency modulation (FM) and amplitude modulation (AM) to construct audio triggers. In real-world scenarios, regular variations in frequency and amplitude are often imperceptible to human listeners, making the proposed attacks more covert. Experimental results show that our methods achieve high attack success rates in both digital and physical settings, while also demonstrating strong resistance to various state-of-the-art backdoor defenses.
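The abstract does not specify how the AM and FM triggers are constructed, so the following is only a minimal sketch of what a regular variation in amplitude or frequency means for a waveform, not the authors' method. The function names, the 5 Hz modulation rate, the 0.2 depth, and the delay-line (vibrato-style) approach to frequency modulation are all illustrative assumptions.

```python
import math


def amplitude_modulate(signal, sample_rate, mod_freq_hz=5.0, depth=0.2):
    """Scale each sample by a slow sinusoidal envelope (tremolo-style AM).

    All parameter values are hypothetical, chosen only for illustration.
    """
    out = []
    for n, x in enumerate(signal):
        t = n / sample_rate
        # Envelope oscillates between (1 - depth) and (1 + depth).
        envelope = 1.0 + depth * math.sin(2.0 * math.pi * mod_freq_hz * t)
        out.append(x * envelope)
    return out


def frequency_modulate(signal, sample_rate, mod_freq_hz=5.0, max_delay_s=0.001):
    """Vibrato-style FM: read the signal through a sinusoidally varying delay.

    A time-varying delay shifts the instantaneous frequency up and down.
    This is an assumed stand-in technique, not taken from the paper.
    """
    out = []
    last = len(signal) - 1
    for n in range(len(signal)):
        t = n / sample_rate
        # Delay (in samples) oscillates between 0 and max_delay_s * sample_rate.
        delay = 0.5 * max_delay_s * sample_rate * (
            1.0 + math.sin(2.0 * math.pi * mod_freq_hz * t)
        )
        # Clamp the read position to the valid sample range.
        pos = min(max(n - delay, 0.0), float(last))
        lo = int(math.floor(pos))
        hi = min(lo + 1, last)
        frac = pos - lo
        # Linear interpolation between the two neighbouring samples.
        out.append((1.0 - frac) * signal[lo] + frac * signal[hi])
    return out


# Usage: modulate a 0.1 s, 440 Hz test tone sampled at 16 kHz.
sample_rate = 16000
tone = [math.sin(2.0 * math.pi * 440.0 * n / sample_rate) for n in range(1600)]
am_tone = amplitude_modulate(tone, sample_rate)
fm_tone = frequency_modulate(tone, sample_rate)
```

Both functions return a list of the same length as the input, and setting `depth=0` or `max_delay_s=0` returns the signal unchanged, which makes it straightforward to compare modulated and unmodulated audio.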
