Replay Attacks Against Audio Deepfake Detection

We show how replay attacks undermine audio deepfake detection: by playing and re-recording deepfake audio through various speakers and microphones, we make spoofed samples appear authentic to detection models. To study this phenomenon in more detail, we introduce ReplayDF, a dataset of recordings derived from M-AILABS and MLAAD, featuring 109 speaker-microphone combinations across six languages and four TTS models. It includes diverse acoustic conditions, some highly challenging for detection. Our analysis of six open-source detection models across five datasets reveals significant vulnerability, with the top-performing W2V2-AASIST model's Equal Error Rate (EER) surging from 4.7% to 18.2%. Even with adaptive Room Impulse Response (RIR) retraining, performance remains compromised, with an 11.0% EER. We release ReplayDF for non-commercial research use.
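To make the two key quantities concrete, here is a minimal sketch (not code from the paper; the function names and the choice of libraries are illustrative) of (a) approximating a replay channel by convolving audio with a room impulse response, as in RIR-based retraining, and (b) computing the Equal Error Rate from detector scores, assuming higher scores indicate bona fide audio.

```python
import numpy as np
from scipy.signal import fftconvolve
from sklearn.metrics import roc_curve


def simulate_replay(waveform: np.ndarray, rir: np.ndarray) -> np.ndarray:
    """Approximate a replay channel by convolving the signal with a room impulse response."""
    replayed = fftconvolve(waveform, rir, mode="full")[: len(waveform)]
    peak = np.max(np.abs(replayed))
    # Normalize to avoid clipping introduced by the convolution.
    return replayed / peak if peak > 0 else replayed


def equal_error_rate(labels: np.ndarray, scores: np.ndarray) -> float:
    """EER: the operating point where false-accept and false-reject rates are equal.

    labels: 1 for bona fide, 0 for spoofed; scores: higher = more likely bona fide.
    """
    fpr, tpr, _ = roc_curve(labels, scores)
    fnr = 1.0 - tpr
    idx = np.nanargmin(np.abs(fnr - fpr))
    return float((fpr[idx] + fnr[idx]) / 2.0)
```

Note that this RIR convolution only models the acoustic channel; a physical replay (real loudspeaker plus microphone) additionally introduces transducer nonlinearities and ambient noise, which is consistent with the residual 11.0% EER after RIR retraining reported above.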
@article{müller2025_2505.14862,
  title   = {Replay Attacks Against Audio Deepfake Detection},
  author  = {Nicolas Müller and Piotr Kawa and Wei-Herng Choong and Adriana Stan and Aditya Tirumala Bukkapatnam and Karla Pizzi and Alexander Wagner and Philip Sperl},
  journal = {arXiv preprint arXiv:2505.14862},
  year    = {2025}
}