Reality Check: Measuring Real-World Applicability of State-of-the-Art Audio Deepfake Detectors on Social Media Data

Karla Schäfer; Martin Steinebach

doi:10.1609/icwsm.v20i1.42803

Authors

Karla Schäfer, Fraunhofer Institute for Secure Information Technology; National Research Center for Applied Cybersecurity
Martin Steinebach, Fraunhofer Institute for Secure Information Technology; National Research Center for Applied Cybersecurity

Abstract

Audio deepfakes are becoming both more realistic and easier to create. At the same time, several audio deepfake detectors have been developed. While some of these have been evaluated using real-world data, there has been no in-depth analysis of their performance in real-world settings. We evaluate five state-of-the-art (SOTA) detectors on two real-world social media datasets. Currently, the equal error rate (EER) is the metric most often used to evaluate audio deepfake detectors. However, when using the EER, the threshold for classifying a recording as genuine or fake is calculated from the prediction scores of the test set itself. In real-world scenarios, this threshold must be set in advance. We are the first to test the performance of SOTA detectors using a range of pre-set thresholds, thereby approximating a real-world setting. On the ITW test set, we found performance degradations (e.g. F1 from 91.35% to 64.65%) when using thresholds other than the one derived from the EER calculation. The SocialDF dataset was found to be especially challenging, with an F1-score of 52.92% achieved using the EER threshold. Using pre-set thresholds resulted in an even lower F1-score of 50.89%, demonstrating that current detectors are unable to reliably detect real-world audio deepfakes.
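
The difference between an EER-derived threshold and a pre-set threshold can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: the function names `eer_threshold` and `f1_at`, the convention that higher scores mean "more likely fake" with label 1 for fake, and the toy scores are all assumptions made for illustration, and the numbers it produces are unrelated to the results reported in the abstract.

```python
import numpy as np

def eer_threshold(scores, labels):
    """Return (eer, threshold) at the point where the false-positive rate
    and false-negative rate are closest.
    Assumed convention: higher score = more likely fake; label 1 = fake, 0 = genuine."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    best = None
    for t in np.unique(scores):               # candidate thresholds, ascending
        pred = scores >= t
        fpr = np.mean(pred[labels == 0])      # genuine recordings flagged as fake
        fnr = np.mean(~pred[labels == 1])     # fake recordings that were missed
        gap = abs(fpr - fnr)
        if best is None or gap < best[0]:
            best = (gap, (fpr + fnr) / 2, t)
    return best[1], best[2]

def f1_at(scores, labels, threshold):
    """F1-score for the 'fake' class at a fixed decision threshold."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    pred = scores >= threshold
    tp = np.sum(pred & (labels == 1))
    fp = np.sum(pred & (labels == 0))
    fn = np.sum(~pred & (labels == 1))
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom else 0.0

# Toy detector scores (made up for illustration only).
test_scores = [0.10, 0.20, 0.30, 0.35, 0.40, 0.45, 0.55, 0.60]
test_labels = [0, 0, 0, 1, 0, 1, 1, 1]

# EER threshold: tuned on the very scores it is then evaluated on.
eer, thr = eer_threshold(test_scores, test_labels)
f1_eer = f1_at(test_scores, test_labels, thr)

# Pre-set threshold: fixed before seeing the test scores (0.5 is an arbitrary choice).
f1_preset = f1_at(test_scores, test_labels, 0.5)

print(f"EER={eer:.2f} at threshold {thr:.2f}, F1={f1_eer:.4f}")
print(f"Pre-set threshold 0.50, F1={f1_preset:.4f}")
```

In this toy case the EER threshold is fitted to the test scores and therefore gives the higher F1-score, while the threshold fixed in advance gives a lower one. This is the same kind of gap the abstract describes, although the size of the gap here carries no meaning.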
