RADEP: A Resilient Adaptive Defense Framework Against Model Extraction Attacks

Abstract

Machine Learning as a Service (MLaaS) enables users to leverage powerful machine learning models through cloud-based APIs, offering scalability and ease of deployment. However, these services are vulnerable to model extraction attacks, in which adversaries repeatedly query the application programming interface (API) to reconstruct a functionally similar model, compromising intellectual property and security. Although various defense strategies have been proposed, many suffer from high computational costs, limited adaptability to evolving attack techniques, and reduced performance for legitimate users. In this paper, we introduce a Resilient Adaptive Defense Framework for Model Extraction Attack Protection (RADEP), a multifaceted framework designed to counteract model extraction attacks through a multi-layered security approach. RADEP employs progressive adversarial training to enhance model resilience against extraction attempts. Malicious queries are detected through a combination of uncertainty quantification and behavioral pattern analysis, effectively identifying adversarial queries. Furthermore, we develop an adaptive response mechanism that dynamically modifies query outputs based on their suspicion scores, reducing the utility of stolen models. Finally, ownership verification is enforced through embedded watermarking and backdoor triggers, enabling reliable identification of unauthorized model use. Experimental evaluations demonstrate that RADEP significantly reduces extraction success rates while maintaining high detection accuracy with minimal impact on legitimate queries. Extensive experiments show that RADEP effectively defends against model extraction attacks and remains resilient even against adaptive adversaries, making it a reliable security framework for MLaaS models.

@article{chakraborty2025_2505.19364,
  title={RADEP: A Resilient Adaptive Defense Framework Against Model Extraction Attacks},
  author={Amit Chakraborty and Sayyed Farid Ahamed and Sandip Roy and Soumya Banerjee and Kevin Choi and Abdul Rahman and Alison Hu and Edward Bowen and Sachin Shetty},
  journal={arXiv preprint arXiv:2505.19364},
  year={2025}
}