A novel TLS-based Fingerprinting approach that combines feature expansion and similarity mapping

Malicious domains are part of the landscape of the internet but are becoming more prevalent and more dangerous to both companies and individuals. They can be hosted on variety of technologies and serve an array of content, ranging from Malware, command and control, and complex Phishing sites that are designed to deceive and expose. Tracking, blocking and detecting such domains is complex, and very often involves complex allow or deny list management or SIEM integration with open-source TLS fingerprinting techniques. Many fingerprint techniques such as JARM and JA3 are used by threat hunters to determine domain classification, but with the increase in TLS similarity, particularly in CDNs, they are becoming less useful. The aim of this paper is to adapt and evolve open-source TLS fingerprinting techniques with increased features to enhance granularity, and to produce a similarity mapping system that enables the tracking and detection of previously unknown malicious domains. This is done by enriching TLS fingerprints with HTTP header data and producing a fine grain similarity visualisation that represented high dimensional data using MinHash and local sensitivity hashing. Influence was taken from the Chemistry domain, where the problem of high dimensional similarity in chemical fingerprints is often encountered. An enriched fingerprint was produced which was then visualised across three separate datasets. The results were analysed and evaluated, with 67 previously unknown malicious domains being detected based on their similarity to known malicious domains and nothing else. The similarity mapping technique produced demonstrates definite promise in the arena of early detection of Malware and Phishing domains.
View on arXiv