
A Nearly-Optimal Bound for Fast Regression with $\ell_\infty$ Guarantee

Abstract

Given a matrix $A\in \mathbb{R}^{n\times d}$ and a vector $b\in \mathbb{R}^n$, we consider the regression problem with $\ell_\infty$ guarantees: finding a vector $x'\in \mathbb{R}^d$ such that $\|x'-x^*\|_\infty \leq \frac{\epsilon}{\sqrt{d}}\cdot \|Ax^*-b\|_2\cdot \|A^\dagger\|$, where $x^*=\arg\min_{x\in \mathbb{R}^d}\|Ax-b\|_2$. One popular approach for solving such $\ell_2$ regression problems is sketching: pick a structured random matrix $S\in \mathbb{R}^{m\times n}$ with $m\ll n$ for which $SA$ can be computed quickly, then solve the "sketched" regression problem $\arg\min_{x\in \mathbb{R}^d} \|SAx-Sb\|_2$. In this paper, we show that in order to obtain such an $\ell_\infty$ guarantee for $\ell_2$ regression, one has to use sketching matrices that are dense. To the best of our knowledge, this is the first use case in which dense sketching matrices are necessary. On the algorithmic side, we prove that there exists a distribution of dense sketching matrices with $m=\epsilon^{-2}d\log^3(n/\delta)$ such that solving the sketched regression problem gives the $\ell_\infty$ guarantee with probability at least $1-\delta$. Moreover, the matrix $SA$ can be computed in time $O(nd\log n)$. Our row count is nearly optimal up to logarithmic factors, and it significantly improves the result in [Price, Song and Woodruff, ICALP'17], which requires a number of rows super-linear in $d$, namely $m=\Omega(\epsilon^{-2}d^{1+\gamma})$ for $\gamma=\Theta(\sqrt{\frac{\log\log n}{\log d}})$. We also develop a novel analytical framework for $\ell_\infty$ guarantee regression that utilizes the Oblivious Coordinate-wise Embedding (OCE) property introduced in [Song and Yu, ICML'21]. Our analysis is arguably much simpler and more general than that of [Price, Song and Woodruff, ICALP'17], and it extends to dense sketches for tensor products of vectors.
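The sketch-and-solve recipe and the $\ell_\infty$ error measure described above can be illustrated with a small numerical experiment. The snippet below is only a minimal sketch under stated assumptions: it uses a dense i.i.d. Gaussian matrix as a stand-in for $S$ (this is not the distribution constructed in the paper, and it does not achieve the $O(nd\log n)$ time for computing $SA$), it takes $\|A^\dagger\|$ to be the spectral norm, and the function names and problem sizes are hypothetical choices made for illustration.

```python
import numpy as np

def sketch_and_solve(A, b, m, rng):
    """Solve argmin_x ||S A x - S b||_2 for a dense Gaussian sketch S (assumed stand-in)."""
    n = A.shape[0]
    # Dense m x n sketch with i.i.d. N(0, 1/m) entries; an illustrative assumption,
    # not the sketching distribution analyzed in the paper.
    S = rng.standard_normal((m, n)) / np.sqrt(m)
    x_sk, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
    return x_sk

def linf_bound_scale(A, b, x_star):
    """Return ||A x* - b||_2 * ||A^+|| / sqrt(d), the right-hand side without epsilon."""
    d = A.shape[1]
    residual = np.linalg.norm(A @ x_star - b)
    # Assumption: ||A^+|| is read as the spectral norm of the pseudoinverse.
    pinv_norm = np.linalg.norm(np.linalg.pinv(A), 2)
    return residual * pinv_norm / np.sqrt(d)

rng = np.random.default_rng(0)
n, d, m = 2000, 10, 400  # hypothetical sizes with d < m << n

A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + rng.standard_normal(n)

# Exact least-squares solution x* and the sketched solution x'.
x_star, *_ = np.linalg.lstsq(A, b, rcond=None)
x_sk = sketch_and_solve(A, b, m, rng)

# Coordinate-wise (l_infinity) error and the scale it is compared against.
linf_err = np.max(np.abs(x_sk - x_star))
scale = linf_bound_scale(A, b, x_star)

# The ratio is the smallest epsilon for which the guarantee holds on this instance.
print("empirical epsilon:", linf_err / scale)
```

The printed ratio is the smallest $\epsilon$ for which the $\ell_\infty$ guarantee holds on this one random instance; it says nothing about the worst case or about the failure probability $\delta$.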
