
Coresets for constrained k-median and k-means clustering in low dimensional Euclidean space

Abstract

We study (Euclidean) k-median and k-means with constraints in the streaming model. There have been recent efforts to design unified algorithms that solve constrained k-means problems without using knowledge of the specific constraint at hand, aside from mild assumptions such as polynomial-time computability of feasibility under the constraint (deciding whether a clustering satisfies the constraint) or the presence of an efficient assignment oracle (given a set of centers, produce an optimal assignment of points to the centers that satisfies the constraint). These algorithms have a running time exponential in k, but can be applied to a wide range of constraints. We demonstrate that a technique proposed in 2019 for solving a specific constrained streaming k-means problem, namely fair k-means clustering, actually implies streaming algorithms for all these constraints. These algorithms work for low-dimensional Euclidean space. [Note that there are more algorithms for streaming fair k-means today; in particular, they now exist for high-dimensional spaces as well.]
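
To make the two assumptions named in the abstract concrete, here is a minimal Python sketch of a feasibility check and a brute-force assignment oracle. It is an illustration only, not the paper's algorithm: the function names, the example data, and the per-cluster size limit used as the constraint are all hypothetical, and the oracle enumerates every assignment, so it is usable only on tiny inputs.

```python
from itertools import product


def sq_dist(p, c):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, c))


def kmeans_cost(points, centers, assignment):
    """k-means cost of assigning points[i] to centers[assignment[i]]."""
    return sum(sq_dist(p, centers[j]) for p, j in zip(points, assignment))


def assignment_oracle(points, centers, feasible):
    """Brute-force oracle: cheapest assignment accepted by `feasible`.

    `feasible` is any predicate on an assignment tuple; the oracle knows
    nothing else about the constraint. Runs in time k^n, so tiny inputs only.
    """
    best, best_cost = None, float("inf")
    for assignment in product(range(len(centers)), repeat=len(points)):
        if not feasible(assignment):
            continue
        cost = kmeans_cost(points, centers, assignment)
        if cost < best_cost:
            best, best_cost = assignment, cost
    return best, best_cost


def at_most_two_per_cluster(assignment):
    """Hypothetical constraint for illustration: no cluster has more than 2 points."""
    return all(assignment.count(j) <= 2 for j in set(assignment))


points = [(0, 0), (0, 1), (1, 0), (10, 10)]
centers = [(0, 0), (10, 10)]

# Without a constraint, every point goes to its nearest center.
_, free_cost = assignment_oracle(points, centers, lambda a: True)
# With the size limit, one nearby point is forced to the far center.
constrained, constrained_cost = assignment_oracle(
    points, centers, at_most_two_per_cluster
)
print(free_cost, constrained_cost)
```

The oracle receives the constraint only as a black-box predicate, which mirrors the setting described above, where the algorithm does not depend on which constraint is in force.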
