Hierarchical clustering with maximum density paths and mixture models

Hierarchical clustering is an effective, interpretable method for analyzing structure in data. It reveals insights at multiple scales without requiring a predefined number of clusters and captures nested patterns and subtle relationships, which are often missed by flat clustering approaches. However, existing hierarchical clustering methods struggle with high-dimensional data, especially when there are no clear density gaps between modes. In this work, we introduce t-NEB, a probabilistically grounded hierarchical clustering method, which yields state-of-the-art clustering performance on naturalistic high-dimensional data. t-NEB consists of three steps: (1) density estimation via overclustering; (2) finding maximum density paths between clusters; (3) creating a hierarchical structure via bottom-up cluster merging. t-NEB uses a probabilistic parametric density model for both overclustering and cluster merging, which yields both high clustering performance and a meaningful hierarchy, making it a valuable tool for exploratory data analysis. Code is available at this https URL.
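The three steps can be illustrated with a toy sketch. This is not the t-NEB implementation, and it makes several simplifying assumptions: the overclustering result is a hand-specified isotropic Gaussian mixture rather than a fitted density model, the path between two components is a straight segment rather than an optimized maximum density path, and the merge score is the lowest mixture density along that segment. All function names (`mixture_density`, `path_min_density`, `merge_hierarchy`) are hypothetical.

```python
import math
from itertools import combinations


def mixture_density(x, weights, means, sigma):
    """Density of an isotropic Gaussian mixture at point x (assumed model)."""
    d = len(x)
    norm = (2.0 * math.pi * sigma ** 2) ** (-d / 2.0)
    total = 0.0
    for w, m in zip(weights, means):
        sq_dist = sum((xi - mi) ** 2 for xi, mi in zip(x, m))
        total += w * norm * math.exp(-sq_dist / (2.0 * sigma ** 2))
    return total


def path_min_density(a, b, weights, means, sigma, n_points=50):
    """Lowest density on the straight segment from a to b.

    The straight segment is a crude stand-in for an optimized
    maximum density path between the two components.
    """
    lowest = float("inf")
    for k in range(n_points + 1):
        t = k / n_points
        x = [ai + t * (bi - ai) for ai, bi in zip(a, b)]
        lowest = min(lowest, mixture_density(x, weights, means, sigma))
    return lowest


def merge_hierarchy(weights, means, sigma):
    """Bottom-up merging: repeatedly join the two clusters whose best
    connecting path has the highest minimum density."""
    n = len(means)
    # Score every pair of mixture components once.
    score = {
        (i, j): path_min_density(means[i], means[j], weights, means, sigma)
        for i, j in combinations(range(n), 2)
    }
    clusters = {i: frozenset([i]) for i in range(n)}  # id -> member components
    merges = []
    next_id = n
    while len(clusters) > 1:
        best = None
        for (ca, members_a), (cb, members_b) in combinations(clusters.items(), 2):
            # Cluster-level score: best path over all cross-cluster component pairs.
            s = max(score[(min(i, j), max(i, j))]
                    for i in members_a for j in members_b)
            if best is None or s > best[0]:
                best = (s, ca, cb)
        s, ca, cb = best
        merged = clusters.pop(ca) | clusters.pop(cb)
        clusters[next_id] = merged
        merges.append((sorted(merged), s))
        next_id += 1
    return merges


# Toy example: two pairs of nearby components separated by a low-density gap.
weights = [0.25, 0.25, 0.25, 0.25]
means = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0)]
sigma = 0.5
merges = merge_hierarchy(weights, means, sigma)
for members, s in merges:
    print(members, s)
```

In this toy setup the nearby components merge first and the two resulting groups merge last, at a much lower path density. The order of merges gives the hierarchy, and the path density at each merge indicates how strongly the merged clusters are connected.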
@article{ritzert2025_2503.15582,
  title={Hierarchical clustering with maximum density paths and mixture models},
  author={Martin Ritzert and Polina Turishcheva and Laura Hansel and Paul Wollenhaupt and Marissa A. Weis and Alexander S. Ecker},
  journal={arXiv preprint arXiv:2503.15582},
  year={2025}
}