Threats from the internet, particularly malicious software (i.e., malware) often use cryptographic algorithms to disguise their actions and even to take control of a victim's system (as in the case of ransomware). Malware and other threats proliferate too quickly for the time-consuming traditional methods of binary analysis to be effective. By automating detection and classification of cryptographic algorithms, we can speed program analysis and more efficiently combat malware. This thesis will present several methods of leveraging machine learning to automatically discover and classify cryptographic algorithms in compiled binary programs. While further work is necessary to fully evaluate these methods on real-world binary programs, the results in this paper suggest that machine learning can be used successfully to detect and identify cryptographic primitives in compiled code. Currently, these techniques successfully detect and classify cryptographic algorithms in small single-purpose programs, and further work is proposed to apply them to real-world examples.
View on arXiv