TraceEvader: Making DeepFakes More Untraceable via Evading the Forgery Model Attribution

Authors

  • Mengjie Wu Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, School of Cyber Science and Engineering, Wuhan University, China
  • Jingui Ma Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, School of Cyber Science and Engineering, Wuhan University, China
  • Run Wang Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, School of Cyber Science and Engineering, Wuhan University, China
  • Sidan Zhang Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, School of Cyber Science and Engineering, Wuhan University, China
  • Ziyou Liang Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, School of Cyber Science and Engineering, Wuhan University, China
  • Boheng Li Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, School of Cyber Science and Engineering, Wuhan University, China
  • Chenhao Lin Xi'an Jiaotong University
  • Liming Fang Nanjing University of Aeronautics and Astronautics
  • Lina Wang Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, School of Cyber Science and Engineering, Wuhan University, China Zhengzhou Xinda Institute of Advanced Technology

DOI:

https://doi.org/10.1609/aaai.v38i18.29973

Keywords:

PEAI: Applications, APP: Misinformation & Fake News, PEAI: Societal Impact of AI

Abstract

In recent years, DeepFakes have posed severe threats to both individuals and celebrities, as realistic DeepFakes facilitate the spread of disinformation. Model attribution techniques aim to attribute DeepFakes to the forgery models that created them, serving provenance purposes and providing explainable results for DeepFake forensics. However, existing model attribution techniques rely on traces left during DeepFake creation and become futile once such traces are disrupted. We observe that certain traces used for model attribution appear in both the high-frequency and low-frequency domains and play divergent roles in attribution. Motivated by this observation, we propose, for the first time, a novel training-free evasion attack, TraceEvader, in the most practical non-box setting. Specifically, TraceEvader injects a universal imitated trace learned from wild DeepFakes into the high-frequency component and introduces adversarial blur into the low-frequency component, where the added distortion confuses the extraction of the traces used for model attribution. A comprehensive evaluation on 4 state-of-the-art (SOTA) model attribution techniques and fake images generated by 8 generative models, including generative adversarial networks (GANs) and diffusion models (DMs), demonstrates the effectiveness of our method. Overall, TraceEvader achieves the highest average attack success rate of 79% and is robust against image transformations and dedicated denoising techniques, under which the average attack success rate remains around 75%. TraceEvader confirms the limitations of current model attribution techniques and calls the attention of DeepFake researchers and practitioners to the need for more robust model attribution techniques.
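
To make the abstract's two-branch design concrete, the following is a minimal Python sketch of the idea, not the authors' implementation: the ideal FFT cutoff mask, the plain Gaussian blur standing in for the paper's adversarial blur, and the names split_frequency, trace_evader_sketch, cutoff, and alpha are all illustrative assumptions.

    # Minimal sketch of the two-branch perturbation described in the abstract.
    # NOT the authors' implementation: cutoff, alpha, and the blur are
    # illustrative assumptions.
    import numpy as np
    from scipy.ndimage import gaussian_filter

    def split_frequency(img, cutoff=0.1):
        """Split a grayscale image into low- and high-frequency components
        via an ideal circular mask in the 2-D Fourier domain."""
        f = np.fft.fftshift(np.fft.fft2(img))
        h, w = img.shape
        yy, xx = np.ogrid[:h, :w]
        radius = np.hypot(yy - h / 2, xx - w / 2)
        low_mask = radius <= cutoff * min(h, w)
        low = np.fft.ifft2(np.fft.ifftshift(f * low_mask)).real
        high = np.fft.ifft2(np.fft.ifftshift(f * ~low_mask)).real
        return low, high

    def trace_evader_sketch(img, imitated_trace, blur_sigma=1.0, alpha=0.05):
        """Inject an imitated trace into the high-frequency band and blur
        the low-frequency band, then recombine the two components."""
        low, high = split_frequency(img)
        high = high + alpha * imitated_trace       # universal imitated trace
        low = gaussian_filter(low, sigma=blur_sigma)  # Gaussian blur as a
                                                      # stand-in for the
                                                      # adversarial blur
        return np.clip(low + high, 0.0, 1.0)

    # Usage (random placeholders for a real image and a learned trace):
    # img = np.random.rand(256, 256)
    # trace = np.random.randn(256, 256)
    # evaded = trace_evader_sketch(img, trace)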

Published

2024-03-24

How to Cite

Wu, M., Ma, J., Wang, R., Zhang, S., Liang, Z., Li, B., Lin, C., Fang, L., & Wang, L. (2024). TraceEvader: Making DeepFakes More Untraceable via Evading the Forgery Model Attribution. Proceedings of the AAAI Conference on Artificial Intelligence, 38(18), 19965-19973. https://doi.org/10.1609/aaai.v38i18.29973

Section

AAAI Technical Track on Philosophy and Ethics of AI