Coreset-based Strategies for Robust Center-type Problems

Abstract

Given a dataset $V$ of points from some metric space, the popular $k$-center problem asks for a subset of $k$ points (centers) in $V$ that minimizes the maximum distance of any point of $V$ from its closest center. The \emph{robust} formulation of the problem features a further parameter $z$ and allows up to $z$ points of $V$ (outliers) to be disregarded when computing the maximum distance from the centers. In this paper, we focus on two important constrained variants of the robust $k$-center problem, namely, the Robust Matroid Center (RMC) problem, where the set of returned centers is constrained to be an independent set of a matroid of rank $k$ built on $V$, and the Robust Knapsack Center (RKC) problem, where each element $i \in V$ is given a positive weight $w_i < 1$ and the aggregate weight of the returned centers must be at most 1. We devise coreset-based strategies for the two problems which yield efficient sequential, MapReduce, and Streaming algorithms. More specifically, for any fixed $\epsilon > 0$, the algorithms return solutions featuring a $(3+\epsilon)$-approximation ratio, which is a mere additive term $\epsilon$ away from the 3-approximations achievable by the best known polynomial-time sequential algorithms for the two problems. Moreover, the algorithms obliviously adapt to the intrinsic complexity of the dataset, captured by its doubling dimension $D$. For wide ranges of the parameters $k, z, \epsilon, D$, we obtain a sequential algorithm with running time linear in $|V|$, and MapReduce/Streaming algorithms with few rounds/passes and substantially sublinear local/working memory.
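
To make the objective concrete, the following minimal Python sketch evaluates the robust $k$-center cost of a given set of centers and checks the RKC weight budget. It only illustrates the problem definitions stated above and is not one of the paper's algorithms. The function names `robust_radius` and `knapsack_feasible`, the choice of Euclidean distance, and the toy points are assumptions made for this example.

```python
import math


def robust_radius(points, centers, z):
    """Cost of `centers` on `points` when up to z points may be discarded.

    Assumption: points are coordinate tuples and the metric is Euclidean.
    """
    if not centers:
        raise ValueError("need at least one center")
    # Distance of every point to its closest center, in increasing order.
    nearest = sorted(min(math.dist(p, c) for c in centers) for p in points)
    # Drop the z farthest points (the outliers); the largest remaining
    # distance is the robust radius.
    kept = nearest[: max(len(nearest) - z, 0)]
    return kept[-1] if kept else 0.0


def knapsack_feasible(center_weights, budget=1.0):
    """RKC constraint: the aggregate weight of the centers is at most 1."""
    return sum(center_weights) <= budget


# Toy instance (hypothetical data): two tight groups and one far-away point.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0), (100.0, 0.0)]
centers = [(0.0, 0.0), (10.0, 0.0)]

radius_no_outliers = robust_radius(points, centers, z=0)
radius_one_outlier = robust_radius(points, centers, z=1)
```

With $z = 0$ the far-away point dominates the cost, while allowing a single outlier reduces the radius to that of the two tight groups, which is the effect the robust formulation is designed to capture.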
