Explainable k-means. Don't be greedy, plant bigger trees!

4 November 2021
K. Makarychev
Liren Shan
arXiv:2111.03193
Abstract

We provide a new bi-criteria $\tilde{O}(\log^2 k)$ competitive algorithm for explainable $k$-means clustering. Explainable $k$-means was recently introduced by Dasgupta, Frost, Moshkovitz, and Rashtchian (ICML 2020). It is described by an easy-to-interpret and easy-to-understand (threshold) decision tree or diagram. The cost of the explainable $k$-means clustering equals the sum of the costs of its clusters; and the cost of each cluster equals the sum of squared distances from the points in the cluster to the center of that cluster. The best non-bi-criteria algorithm for explainable clustering is $\tilde{O}(k)$ competitive, and this bound is tight. Our randomized bi-criteria algorithm constructs a threshold decision tree that partitions the data set into $(1+\delta)k$ clusters (where $\delta \in (0,1)$ is a parameter of the algorithm). The cost of this clustering is at most $\tilde{O}(1/\delta \cdot \log^2 k)$ times the cost of the optimal unconstrained $k$-means clustering. We show that this bound is almost optimal.
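
The sketch below illustrates only the general threshold-decision-tree representation that explainable $k$-means is built on: run ordinary $k$-means, then separate the $k$ centers with axis-aligned threshold cuts so that every data point can be assigned to a cluster by answering simple "is coordinate $j$ below $\theta$?" questions. It is not the paper's randomized bi-criteria algorithm; the names ThresholdNode and build_tree, the widest-coordinate median-cut rule, and the synthetic data are assumptions made purely for illustration.

# Illustrative sketch of a threshold decision tree for explainable k-means.
# Assumption: this is a naive greedy cut rule, not the paper's randomized
# bi-criteria construction.
import numpy as np
from sklearn.cluster import KMeans


class ThresholdNode:
    """Node of a threshold decision tree: a leaf holds exactly one center index;
    an internal node stores an axis-aligned cut (feature, threshold)."""

    def __init__(self, centers_idx, feature=None, threshold=None, left=None, right=None):
        self.centers_idx = centers_idx
        self.feature = feature
        self.threshold = threshold
        self.left = left
        self.right = right


def build_tree(centers, centers_idx):
    """Recursively split a group of distinct centers with axis-aligned cuts
    until every leaf contains a single center."""
    if len(centers_idx) == 1:
        return ThresholdNode(centers_idx)
    sub = centers[centers_idx]
    # Cut along the coordinate where these centers are most spread out.
    feature = int(np.argmax(sub.max(axis=0) - sub.min(axis=0)))
    vals = np.sort(np.unique(sub[:, feature]))
    # Midpoint between the two middle distinct values; both sides stay non-empty.
    threshold = float((vals[len(vals) // 2 - 1] + vals[len(vals) // 2]) / 2.0)
    left_idx = [i for i in centers_idx if centers[i, feature] <= threshold]
    right_idx = [i for i in centers_idx if centers[i, feature] > threshold]
    return ThresholdNode(centers_idx, feature, threshold,
                         build_tree(centers, left_idx),
                         build_tree(centers, right_idx))


def assign(tree, x):
    """Route a point down the threshold cuts and return the leaf's center index."""
    node = tree
    while node.feature is not None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.centers_idx[0]


# Usage: cluster synthetic data, build the explaining tree, compare costs.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
k = 5
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
tree = build_tree(km.cluster_centers_, list(range(k)))
labels = np.array([assign(tree, x) for x in X])
# Cost of the tree-induced clustering: squared distances to each leaf's center.
tree_cost = sum(np.sum((X[labels == i] - km.cluster_centers_[i]) ** 2) for i in range(k))
print(f"k-means cost: {km.inertia_:.1f}, threshold-tree cost: {tree_cost:.1f}")

The ratio between the tree-induced cost and the unconstrained $k$-means cost is exactly the competitive ratio the abstract refers to; the paper's contribution is a randomized construction that, by allowing $(1+\delta)k$ leaves instead of $k$, brings this ratio down to $\tilde{O}(1/\delta \cdot \log^2 k)$.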
