Reality Check: Measuring Real-World Applicability of State-of-the-Art Audio Deepfake Detectors on Social Media Data

Authors

  • Karla Schäfer, Fraunhofer Institute for Secure Information Technology, National Research Center for Applied Cybersecurity
  • Martin Steinebach, Fraunhofer Institute for Secure Information Technology, National Research Center for Applied Cybersecurity

DOI:

https://doi.org/10.1609/icwsm.v20i1.42803

Abstract

Audio deepfakes are becoming both more realistic and easier to create. At the same time, several audio deepfake detectors have been developed. While some of these have been evaluated on real-world data, there has been no in-depth analysis of their performance in real-world settings. We evaluate five state-of-the-art (SOTA) detectors on two real-world social media datasets. Currently, audio deepfake detectors are mostly evaluated using the equal error rate (EER). However, with the EER, the threshold that decides whether a recording is classified as genuine is calculated from the prediction scores of the test set itself. In real-world scenarios, this threshold must be set in advance. We are the first to test the performance of SOTA detectors using a range of thresholds fixed beforehand, thereby reproducing a real-world setting. On the ITW test set, we found degradations (e.g., F1 falling from 91.35% to 64.65%) when using thresholds other than the one obtained from the EER calculation. The SocialDF dataset was found to be especially challenging, with an F1-score of only 52.92% achieved using the EER threshold. Using pre-set thresholds resulted in an even lower F1-score of 50.89%, demonstrating that current detectors are unable to reliably detect real-world audio deepfakes.
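
The difference between an EER-derived threshold and a threshold fixed in advance can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names `eer_and_threshold` and `f1_at_threshold` are hypothetical, the toy scores are made up for illustration, and the convention that a higher score means "more likely fake" is an assumption (individual detectors may use the opposite convention).

```python
import numpy as np

def eer_and_threshold(scores, labels):
    """Scan every observed score as a candidate threshold and return
    (EER, threshold) at the point where the two error rates are closest.
    Assumed convention: labels 1 = fake, 0 = genuine; higher score = more fake."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    best_gap, best = np.inf, (None, None)
    for t in np.unique(scores):
        pred_fake = scores >= t
        far = np.mean(pred_fake[labels == 0])   # genuine clips flagged as fake
        frr = np.mean(~pred_fake[labels == 1])  # fake clips that were missed
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, best = gap, ((far + frr) / 2, t)
    return best

def f1_at_threshold(scores, labels, threshold):
    """F1-score for the fake class when the threshold is fixed beforehand."""
    pred_fake = np.asarray(scores, dtype=float) >= threshold
    is_fake = np.asarray(labels).astype(bool)
    tp = np.sum(pred_fake & is_fake)
    fp = np.sum(pred_fake & ~is_fake)
    fn = np.sum(~pred_fake & is_fake)
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom else 0.0

# Toy test set (illustrative values only, not taken from the paper).
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.8, 0.9]

# The EER threshold is tuned on the very scores it is evaluated on.
eer, eer_thr = eer_and_threshold(scores, labels)
print("EER:", eer, "threshold:", eer_thr)
print("F1 at EER threshold:", f1_at_threshold(scores, labels, eer_thr))

# A threshold chosen in advance (here 0.85, an arbitrary choice) can
# yield a noticeably different F1 on the same data.
print("F1 at pre-set threshold 0.85:", f1_at_threshold(scores, labels, 0.85))
```

Because `eer_and_threshold` sees the test labels, its threshold is not available at deployment time; `f1_at_threshold` with a value chosen beforehand mirrors the pre-set setting described in the abstract.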

Published

2026-05-25

How to Cite

Schäfer, K., & Steinebach, M. (2026). Reality Check: Measuring Real-World Applicability of State-of-the-Art Audio Deepfake Detectors on Social Media Data. Proceedings of the International AAAI Conference on Web and Social Media, 20(1), 3031–3036. https://doi.org/10.1609/icwsm.v20i1.42803