The Fast Cauchy Transform: with Applications to Basis Construction, Regression, and Subspace Approximation in L1

We give fast algorithms for $\ell_p$ regression and related problems: for an $n \times d$ input matrix $A$ and vector $b \in \mathbb{R}^n$, in $O(nd\log n)$ time we reduce the problem $\min_{x \in \mathbb{R}^d} \|Ax - b\|_p$ to the same problem with input matrix $\tilde{A}$ of dimension $s \times d$ and corresponding $\tilde{b}$ of dimension $s \times 1$; $\tilde{A}$ and $\tilde{b}$ are a \emph{coreset} for the problem, consisting of sampled and rescaled rows of $A$ and $b$. Here $s$ is independent of $n$, and polynomial in $d$. Our results improve on the best previous algorithms when $n \gg d$, for all $p \in [1,\infty)$ except $p = 2$; in particular, the $O(nd^{1.376+})$ running time of Sohler and Woodruff (STOC, 2011) for $p = 1$, that uses asymptotically fast matrix multiplication, and the $O(nd^5 \log n)$ time of Dasgupta \emph{et al.} (SODA, 2008) for general $p$. We also give a detailed empirical evaluation of implementations of our algorithms for $p = 1$, comparing them with several related algorithms. Among other things, our results clearly show that, in the asymptotic regime, the practice follows the theory closely. In addition, we show near-optimal results for $\ell_1$ regression problems that are too large for any prior solution methods. Our algorithms use our faster constructions of well-conditioned bases for $\ell_p$ spaces, and, for $p = 1$, a fast subspace embedding: a matrix $\Pi$, found obliviously to $A$ (that is, chosen without looking at $A$), that approximately preserves the $\ell_1$ norms of all vectors in $\{Ax : x \in \mathbb{R}^d\}$; that is, $\|Ax\|_1 \le \|\Pi A x\|_1 \le \kappa \|Ax\|_1$, for all $x$, with distortion $\kappa$ polynomial in $d$. Moreover, $\Pi A$ can be computed in $O(nd\log n)$ time. Our techniques include fast Johnson-Lindenstrauss transforms, low-coherence matrices, and rescaling by Cauchy random variables.
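To make the sample-and-rescale pipeline concrete, the following is a minimal numerical sketch in Python; it is not the paper's algorithm. A dense $r \times n$ Cauchy matrix stands in for the Fast Cauchy Transform (so the embedding costs $O(rnd)$ rather than $O(nd\log n)$), an ordinary QR factorization of $\Pi[A\; b]$ stands in for the well-conditioned-basis constructions, and the embedding dimension, coreset size, and the helper names `cauchy_coreset` and `l1_regression` are illustrative choices, not anything from the paper.

```python
import numpy as np
from scipy.optimize import linprog


def l1_regression(A, b):
    """Solve min_x ||Ax - b||_1 exactly via an LP with slack variables t >= |Ax - b|."""
    n, d = A.shape
    # Variables: [x (d), t (n)]; minimize sum(t) subject to -t <= Ax - b <= t.
    c = np.concatenate([np.zeros(d), np.ones(n)])
    A_ub = np.block([[A, -np.eye(n)], [-A, -np.eye(n)]])
    b_ub = np.concatenate([b, -b])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] * (d + n), method="highs")
    assert res.success, res.message
    return res.x[:d]


def cauchy_coreset(A, b, r=None, s=None, seed=None):
    """Build a sampled-and-rescaled coreset (A_tilde, b_tilde) for min_x ||Ax - b||_1.

    Simplifications relative to the paper: a dense Cauchy matrix replaces the
    Fast Cauchy Transform, and QR of (Pi [A b]) replaces the paper's
    well-conditioned-basis constructions; r and s are heuristic choices.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    r = r or 4 * d          # embedding dimension (heuristic)
    s = s or 20 * d         # target coreset size (heuristic)

    Ab = np.column_stack([A, b])               # embed [A b] jointly
    Pi = rng.standard_cauchy(size=(r, n))      # dense Cauchy embedding
    _, R = np.linalg.qr(Pi @ Ab)               # R makes Ab @ inv(R) roughly well conditioned
    U = Ab @ np.linalg.inv(R)                  # approximately well-conditioned basis for range([A b])

    lev = np.abs(U).sum(axis=1)                # l1 "leverage"-style row scores
    p = np.minimum(1.0, s * lev / lev.sum())   # sampling probabilities
    keep = rng.random(n) < p
    scale = 1.0 / p[keep]                      # rescale kept rows so the objective stays unbiased
    return A[keep] * scale[:, None], b[keep] * scale


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d = 20000, 10
    A = rng.standard_normal((n, d))
    x_true = rng.standard_normal(d)
    b = A @ x_true + 0.1 * rng.standard_cauchy(n)   # heavy-tailed noise favors l1 regression

    A_s, b_s = cauchy_coreset(A, b, seed=1)
    x_hat = l1_regression(A_s, b_s)                 # solve only the small coreset problem
    x_ls = np.linalg.lstsq(A, b, rcond=None)[0]     # least-squares baseline on the full data
    print("coreset rows:", len(b_s))
    print("l1 objective, coreset solution vs least squares:",
          np.abs(A @ x_hat - b).sum(), "vs", np.abs(A @ x_ls - b).sum())
```

The point of the design is that the only LP ever solved has a number of rows polynomial in $d$, independent of $n$; in the paper, replacing the dense Cauchy matrix with the Fast Cauchy Transform and the QR step with the faster well-conditioned-basis constructions is what brings the full reduction down to the $O(nd\log n)$ time stated above.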