On the Security Risks of ML-based Malware Detection Systems: A Survey

Abstract

Malware presents a persistent threat to user privacy and data integrity. To combat this threat, machine learning-based (ML-based) malware detection (MD) systems have been developed. However, these systems have increasingly come under attack in recent years, undermining their effectiveness in practice. Although the security risks of ML-based MD systems have garnered considerable attention, most prior work is limited to adversarial malware examples and lacks a comprehensive analysis of practical security risks. This paper addresses the gap by using the CIA (confidentiality, integrity, availability) principles to define the scope of security risks. We then deconstruct ML-based MD systems into distinct operational stages, yielding a stage-based taxonomy. Using this taxonomy, we summarize technical progress and discuss gaps in the attack and defense proposals for ML-based MD systems within each stage. We then conduct two case studies, applying both inter-stage and intra-stage analyses under the stage-based taxonomy to provide new empirical insights. Based on these analyses and insights, we suggest potential future directions from both inter-stage and intra-stage perspectives.

@article{he2025_2505.10903,
  title={On the Security Risks of ML-based Malware Detection Systems: A Survey},
  author={Ping He and Yuhao Mao and Changjiang Li and Lorenzo Cavallaro and Ting Wang and Shouling Ji},
  journal={arXiv preprint arXiv:2505.10903},
  year={2025}
}