Guessing Efficiently for Constrained Subspace Approximation

Aditya Bhaskara
Sepideh Mahabadi
Madhusudhan Reddy Pittu
Ali Vakilian
David P. Woodruff
Abstract

In this paper, we study the constrained subspace approximation problem. Given a set of $n$ points $\{a_1,\ldots,a_n\}$ in $\mathbb{R}^d$, the goal of the \emph{subspace approximation} problem is to find a $k$-dimensional subspace that best approximates the input points. More precisely, for a given $p\geq 1$, we aim to minimize the $p$th power of the $\ell_p$ norm of the error vector $(\|a_1-\bm{P}a_1\|,\ldots,\|a_n-\bm{P}a_n\|)$, where $\bm{P}$ denotes the projection matrix onto the subspace and the norms are Euclidean. In \emph{constrained} subspace approximation (CSA), we additionally have constraints on the projection matrix $\bm{P}$. In its most general form, we require $\bm{P}$ to belong to a given subset $\mathcal{S}$ that is described explicitly or implicitly. We introduce a general framework for constrained subspace approximation. Our approach, which we term coreset-guess-solve, yields either $(1+\varepsilon)$-multiplicative or $\varepsilon$-additive approximations for a variety of constraints. We show that it provides new algorithms for partition-constrained subspace approximation with applications to \emph{fair} subspace approximation, $k$-means clustering, and projected non-negative matrix factorization, among others. Specifically, while we recover the best known bounds for $k$-means clustering in Euclidean spaces, we improve the known results for the remaining problems.
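The objective defined above can be made concrete with a short sketch. The snippet below (an illustration, not the paper's algorithm; the function name `csa_objective` is our own) evaluates the $\ell_p$ subspace-approximation cost $\sum_i \|a_i - \bm{P}a_i\|^p$ for a subspace given by an orthonormal basis:

```python
import numpy as np

def csa_objective(A, V, p=2):
    """l_p subspace-approximation cost: sum_i ||a_i - P a_i||_2^p,
    where P = V V^T projects onto the column span of the
    orthonormal d x k matrix V. Rows of A are the points a_i."""
    P = V @ V.T                                # projection matrix onto the subspace
    residuals = A - A @ P                      # a_i - P a_i for each row a_i
    dists = np.linalg.norm(residuals, axis=1)  # Euclidean norm per point
    return np.sum(dists ** p)

# Example: two points in R^3, k = 1 subspace spanned by e_1.
A = np.array([[1.0, 2.0, 0.0],
              [3.0, 0.0, 4.0]])
V = np.array([[1.0], [0.0], [0.0]])            # orthonormal basis of span(e_1)
# residuals are (0,2,0) and (0,0,4), with norms 2 and 4,
# so the p = 2 cost is 2^2 + 4^2 = 20.
print(csa_objective(A, V, p=2))
```

A constraint set $\mathcal{S}$ would restrict which matrices $\bm{P} = VV^\top$ are admissible; the unconstrained $p=2$ case reduces to principal component analysis.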

@article{bhaskara2025_2504.20883,
  title={Guessing Efficiently for Constrained Subspace Approximation},
  author={Aditya Bhaskara and Sepideh Mahabadi and Madhusudhan Reddy Pittu and Ali Vakilian and David P. Woodruff},
  journal={arXiv preprint arXiv:2504.20883},
  year={2025}
}