arXiv:1904.12728

Accurate MapReduce Algorithms for $k$-median and $k$-means in General Metric Spaces

29 April 2019
Alessio Mazzetto
A. Pietracaprina
G. Pucci
Abstract

Center-based clustering is a fundamental primitive for data analysis and becomes very challenging for large datasets. In this paper, we focus on the popular $k$-median and $k$-means variants which, given a set $P$ of points from a metric space and a parameter $k < |P|$, require identifying a set $S$ of $k$ centers minimizing, respectively, the sum of the distances and the sum of the squared distances of all points in $P$ from their closest centers. Our specific focus is on general metric spaces, for which it is reasonable to require that the centers belong to the input set (i.e., $S \subseteq P$). We present coreset-based 3-round distributed approximation algorithms for the above problems using the MapReduce computational model. The algorithms are rather simple and obliviously adapt to the intrinsic complexity of the dataset, captured by the doubling dimension $D$ of the metric space. Remarkably, the algorithms attain approximation ratios that can be made arbitrarily close to those achievable by the best known polynomial-time sequential approximations, and they are very space efficient for small $D$, requiring local memory sizes substantially sublinear in the input size. To the best of our knowledge, no previous distributed approaches were able to attain similar quality-performance guarantees in general metric spaces.
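
To make the two objectives in the abstract concrete, the following is a minimal Python sketch of the k-median and k-means costs with centers restricted to the input set. It is not the paper's MapReduce algorithm: the function names `clustering_cost` and `brute_force_centers` are hypothetical, the exhaustive search is exponential and only suitable for toy inputs, and the distance function is assumed to be any user-supplied metric.

```python
from itertools import combinations
from typing import Callable, Sequence, Tuple, TypeVar

T = TypeVar("T")
Metric = Callable[[T, T], float]  # assumed to satisfy the metric axioms


def clustering_cost(points: Sequence[T], centers: Sequence[T],
                    dist: Metric, power: int = 1) -> float:
    """Sum over all points of (distance to the closest center) ** power.

    power=1 is the k-median objective, power=2 the k-means objective.
    """
    return sum(min(dist(p, c) for c in centers) ** power for p in points)


def brute_force_centers(points: Sequence[T], k: int, dist: Metric,
                        power: int = 1) -> Tuple[Tuple[T, ...], float]:
    """Try every k-subset S of the input (so S is a subset of P) and keep the cheapest.

    Illustration only: this enumerates all C(|P|, k) subsets.
    """
    if not 0 < k < len(points):
        raise ValueError("k must satisfy 0 < k < |P|")
    best = min(combinations(points, k),
               key=lambda s: clustering_cost(points, s, dist, power))
    return best, clustering_cost(points, best, dist, power)


if __name__ == "__main__":
    # Toy metric space: integers on a line with absolute difference.
    pts = [0, 1, 2, 3, 10]
    line = lambda a, b: abs(a - b)
    print(brute_force_centers(pts, 1, line, power=1))  # ((2,), 12)
    print(brute_force_centers(pts, 1, line, power=2))  # ((3,), 63)
```

On this toy input the two objectives pick different centers: the outlier at 10 pulls the k-means choice toward it because distances are squared, while the k-median choice stays at the median point.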
