Modulation-Based Backdoors: Leveraging Amplitude and Frequency Patterns to Attack Speaker Recognition
DOI:
https://doi.org/10.1609/aaai.v40i1.36961
Abstract
Deep neural networks (DNNs) are widely and successfully applied in the field of speaker recognition. However, recent studies reveal that these models are vulnerable to backdoor attacks, where adversaries inject malicious behaviors into victim models by poisoning the training process. Existing attack methods often rely on environmental noise or complex voice transformations, which are typically difficult to implement and exhibit poor stealthiness. To address these issues, this paper proposes two modulation-based backdoor attacks that leverage frequency modulation (FM) and amplitude modulation (AM) to construct audio triggers. In real-world scenarios, regular variations in frequency and amplitude are often imperceptible to human listeners, making the proposed attacks more covert. Experimental results show that our methods achieve high attack success rates in both digital and physical settings, while also demonstrating strong resistance to various state-of-the-art backdoor defenses.
Published
2026-03-14
How to Cite
Cai, H., Zhang, P., Xiao, Y., Li, D., Chu, H., & Luo, Y. (2026). Modulation-Based Backdoors: Leveraging Amplitude and Frequency Patterns to Attack Speaker Recognition. Proceedings of the AAAI Conference on Artificial Intelligence, 40(1), 30–38. https://doi.org/10.1609/aaai.v40i1.36961
Issue
Section
AAAI Technical Track on Application Domains I