arXiv:2310.19068

Sketching Algorithms for Sparse Dictionary Learning: PTAS and Turnstile Streaming

29 October 2023
Gregory Dexter
P. Drineas
David P. Woodruff
T. Yasuda
Abstract

Sketching algorithms have recently proven to be a powerful approach both for designing low-space streaming algorithms as well as fast polynomial time approximation schemes (PTAS). In this work, we develop new techniques to extend the applicability of sketching-based approaches to the sparse dictionary learning and the Euclidean $k$-means clustering problems. In particular, we initiate the study of the challenging setting where the dictionary/clustering assignment for each of the $n$ input points must be output, which has surprisingly received little attention in prior work. On the fast algorithms front, we obtain a new approach for designing PTAS's for the $k$-means clustering problem, which generalizes to the first PTAS for the sparse dictionary learning problem. On the streaming algorithms front, we obtain new upper bounds and lower bounds for dictionary learning and $k$-means clustering. In particular, given a design matrix $\mathbf A\in\mathbb R^{n\times d}$ in a turnstile stream, we show an $\tilde O(nr/\epsilon^2 + dk/\epsilon)$ space upper bound for $r$-sparse dictionary learning of size $k$, an $\tilde O(n/\epsilon^2 + dk/\epsilon)$ space upper bound for $k$-means clustering, as well as an $\tilde O(n)$ space upper bound for $k$-means clustering on random order row insertion streams with a natural "bounded sensitivity" assumption. On the lower bounds side, we obtain a general $\tilde\Omega(n/\epsilon + dk/\epsilon)$ lower bound for $k$-means clustering, as well as an $\tilde\Omega(n/\epsilon^2)$ lower bound for algorithms which can estimate the cost of a single fixed set of candidate centers.
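
For readers unfamiliar with the turnstile model mentioned above, the following is a minimal Python sketch of the general idea. It is not the algorithm from the paper. It assumes a generic random-sign projection, and the names `TurnstileRowSketch`, `make_sign_matrix`, and `kmeans_cost` are hypothetical and introduced only for illustration. Entries of $\mathbf A$ arrive as additive updates (positive or negative), and a linear sketch $\mathbf B = \mathbf A\mathbf S$ is maintained without ever storing $\mathbf A$ itself. The helper `kmeans_cost` evaluates the standard $k$-means objective for a fixed set of candidate centers.

```python
import random


def make_sign_matrix(d, m, seed=0):
    """Random d x m matrix with entries +-1/sqrt(m) (illustrative choice)."""
    rng = random.Random(seed)
    scale = 1.0 / m ** 0.5
    return [[rng.choice((-scale, scale)) for _ in range(m)] for _ in range(d)]


class TurnstileRowSketch:
    """Maintains B = A S for an n x d matrix A under entry-wise updates.

    Assumption: this is a generic linear sketch used to illustrate the
    turnstile model, not the sketch analyzed in the paper.
    """

    def __init__(self, n, d, m, seed=0):
        self.S = make_sign_matrix(d, m, seed)
        self.B = [[0.0] * m for _ in range(n)]

    def update(self, i, j, delta):
        # A[i][j] += delta implies B[i] += delta * S[j], by linearity.
        row, s_j = self.B[i], self.S[j]
        for t in range(len(row)):
            row[t] += delta * s_j[t]


def sq_dist(x, y):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def kmeans_cost(points, centers):
    """Sum over points of the squared distance to the nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)
```

Because the sketch is linear, an insertion followed by a matching deletion leaves `B` unchanged, which is what makes deletions cheap in the turnstile model. The sketch above stores `n * m` numbers plus the projection; the space bounds quoted in the abstract come from the paper's own constructions and are not reproduced by this toy example.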
