Coresets for constrained k-median and k-means clustering in low dimensional Euclidean space

We study (Euclidean) k-median and k-means with constraints in the streaming model. There have been recent efforts to design unified algorithms that solve constrained k-means problems without using knowledge of the specific constraint at hand, aside from mild assumptions such as polynomial-time computability of feasibility under the constraint (deciding whether a clustering satisfies the constraint) or the presence of an efficient assignment oracle (given a set of centers, produce an optimal assignment of points to the centers that satisfies the constraint). These algorithms have a running time exponential in k, but can be applied to a wide range of constraints. We demonstrate that a technique proposed in 2019 for solving a specific constrained streaming k-means problem, namely fair k-means clustering, actually implies streaming algorithms for all these constraints. These algorithms work for low-dimensional Euclidean space. [Note that there are more algorithms for streaming fair k-means today; in particular, they now exist for high-dimensional spaces as well.]