As malware detection evolves, attackers adopt sophisticated evasion tactics. Traditional file-level fingerprinting, such as cryptographic and fuzzy hashes, is often overlooked as a target for evasion. Malware variants exploit minor binary modifications to bypass detection, as seen in Microsoft's discovery of GoldMax variations (2020-2021). However, no large-scale empirical studies have assessed the limitations of traditional fingerprinting methods on real-world malware samples or explored improvements.

This paper fills this gap by addressing three key questions: (a) How prevalent are file variants in malware samples? Analyzing 4 million Windows Portable Executable (PE) files, 21 million sections, and 48 million resources, we find up to 80% deep structural similarities, including common APIs and executable sections. (b) What evasion techniques are used? We identify resilient fingerprints (clusters of malware variants with high similarity) validated via VirusTotal. Our analysis reveals non-functional mutations, such as altered section numbers, virtual sizes, and section names, as primary evasion tactics. We also classify two key section types: malicious sections (high entropy >5) and camouflage sections (entropy = 0). (c) How can fingerprinting be improved? We propose two novel approaches that enhance detection, improving identification rates from 20% (traditional methods) to over 50% using our refined fingerprinting techniques.

Our findings highlight the limitations of existing methods and propose new strategies to strengthen malware fingerprinting against evolving threats.
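The entropy thresholds mentioned in the abstract can be illustrated with a minimal sketch. It assumes that "entropy" means Shannon entropy measured in bits per byte over a section's raw contents, which the abstract does not state explicitly. The function names and the "other" label are hypothetical, and the paper's actual classification may rely on more than entropy alone.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    entropy = 0.0
    for count in Counter(data).values():
        p = count / total
        # Each byte value contributes p * log2(1/p) bits.
        entropy += p * math.log2(1 / p)
    return entropy


def classify_section(data: bytes) -> str:
    """Label a section using the thresholds quoted in the abstract.

    Assumption: entropy alone decides the label. "other" is a
    hypothetical catch-all that does not come from the paper.
    """
    h = shannon_entropy(data)
    if h == 0.0:
        return "camouflage"  # entropy = 0, e.g. a run of identical bytes
    if h > 5.0:
        return "malicious"   # high entropy > 5
    return "other"
```

For example, a section of 512 zero bytes would be labeled "camouflage" under this sketch, while a section whose bytes are uniformly distributed would be labeled "malicious". Note that an empty section also has entropy 0 here and is therefore labeled "camouflage", which may or may not match the paper's handling.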
@article{abuadbba2025_2503.06495,
  title={Enhancing Malware Fingerprinting through Analysis of Evasive Techniques},
  author={Alsharif Abuadbba and Sean Lamont and Ejaz Ahmed and Cody Christopher and Muhammad Ikram and Uday Tupakula and Daniel Coscia and Mohamed Ali Kaafar and Surya Nepal},
  journal={arXiv preprint arXiv:2503.06495},
  year={2025}
}