New Coresets for Projective Clustering and Applications

Abstract

$(j,k)$-projective clustering is the natural generalization of the family of $k$-clustering and $j$-subspace clustering problems. Given a set of points $P$ in $\mathbb{R}^d$, the goal is to find $k$ flats of dimension $j$, i.e., affine subspaces, that best fit $P$ under a given distance measure. In this paper, we propose the first algorithm that returns an $L_\infty$ coreset of size polynomial in $d$. Moreover, we give the first strong coreset construction for general $M$-estimator regression. Specifically, we show that our construction provides efficient coreset constructions for Cauchy, Welsch, Huber, Geman-McClure, Tukey, $L_1-L_2$, and Fair regression, as well as general concave and power-bounded loss functions. Finally, we provide experimental results based on real-world datasets, showing the efficacy of our approach.
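
To make the problem statement concrete, the following is a minimal sketch of how a candidate solution to projective clustering could be scored. It is not the paper's algorithm and does not build a coreset. The function names, the representation of a flat as an origin point plus an orthonormal basis, and the two objectives shown (a sum of powered distances and a maximum distance) are assumptions made for illustration; the abstract only says "a given distance measure".

```python
import math

def dist_to_flat(p, origin, basis):
    """Euclidean distance from point p to the flat origin + span(basis).

    Assumption: `basis` is a list of j orthonormal vectors (not checked here).
    """
    diff = [pi - oi for pi, oi in zip(p, origin)]
    sq_norm = sum(x * x for x in diff)
    # Squared length of the projection of diff onto the flat's direction space.
    sq_proj = sum(sum(x * b for x, b in zip(diff, vec)) ** 2 for vec in basis)
    # Clamp at zero to guard against tiny negative values from rounding.
    return math.sqrt(max(sq_norm - sq_proj, 0.0))

def nearest_flat_dist(p, flats):
    """Distance from p to the closest of the k flats."""
    return min(dist_to_flat(p, origin, basis) for origin, basis in flats)

def sum_cost(points, flats, power=2.0):
    """Sum over all points of (distance to nearest flat) ** power."""
    return sum(nearest_flat_dist(p, flats) ** power for p in points)

def max_cost(points, flats):
    """Largest distance from any point to its nearest flat (a max-type objective)."""
    return max(nearest_flat_dist(p, flats) for p in points)

# Hypothetical toy instance: j = 1, k = 2 in the plane.
flats = [
    ((0.0, 0.0), [(1.0, 0.0)]),  # the line y = 0
    ((5.0, 0.0), [(0.0, 1.0)]),  # the line x = 5
]
points = [(1.0, 2.0), (5.5, 3.0), (-1.0, 0.0)]

total = sum_cost(points, flats)
worst = max_cost(points, flats)
```

Setting $j = 0$ (each flat is a single point with an empty basis) reduces `sum_cost` with `power=2.0` to the $k$-means objective, and setting $k = 1$ gives a single-subspace fitting cost, which matches the two special cases named above.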
