Traditional k-means clustering underperforms on non-convex shapes and requires the number of clusters k to be specified in advance. We propose a simple geometric enhancement: after standard k-means, each cluster center is assigned a radius (the distance to its farthest assigned point), and clusters whose radii overlap are merged. This post-processing step loosens the requirement for exact k: as long as k is overestimated (but not excessively), the method can often reconstruct non-convex shapes through meaningful merges. We also show that this approach supports recursive partitioning: clustering can be performed independently on tiled regions of the feature space, then globally merged, making the method scalable and suitable for distributed systems. Implemented as a lightweight post-processing step atop scikit-learn's k-means, the algorithm performs well on benchmark datasets, achieving high accuracy with minimal additional computation.
View on arXiv@article{kober2025_2504.20293, title={ Radius-Guided Post-Clustering for Shape-Aware, Scalable Refinement of k-Means Results }, author={ Stefan Kober }, journal={arXiv preprint arXiv:2504.20293}, year={ 2025 } }