
A spectral least-squares-type method for heavy-tailed corrupted regression with unknown covariance & heterogeneous noise

Abstract

We revisit heavy-tailed corrupted least-squares linear regression, assuming a corrupted $n$-sized label-feature sample with at most $\epsilon n$ arbitrary outliers. We wish to estimate a $p$-dimensional parameter $b^*$ given such a sample of a label-feature pair $(y,x)$ satisfying $y=\langle x,b^*\rangle+\xi$ with heavy-tailed $(x,\xi)$. We only assume that $x$ is $L^4-L^2$ hypercontractive with constant $L>0$ and has covariance matrix $\Sigma$ with minimum eigenvalue $1/\mu^2>0$ and bounded condition number $\kappa>0$. The noise $\xi$ can be arbitrarily dependent on $x$ and nonsymmetric as long as $\xi x$ has a finite covariance matrix $\Xi$. We propose a near-optimal computationally tractable estimator, based on the power method, assuming no knowledge of $(\Sigma,\Xi)$ nor of the operator norm of $\Xi$. With probability at least $1-\delta$, our proposed estimator attains the statistical rate $\mu^2\Vert\Xi\Vert^{1/2}(\frac{p}{n}+\frac{\log(1/\delta)}{n}+\epsilon)^{1/2}$ and breakdown point $\epsilon\lesssim\frac{1}{L^4\kappa^2}$, both optimal in the $\ell_2$-norm, assuming the near-optimal minimum sample size $L^4\kappa^2(p\log p + \log(1/\delta))\lesssim n$, up to a log factor. To the best of our knowledge, this is the first computationally tractable algorithm that simultaneously satisfies all of the mentioned properties. Our estimator is based on a two-stage Multiplicative Weight Update algorithm. The first stage estimates a descent direction $\hat v$ with respect to the (unknown) pre-conditioned inner product $\langle\Sigma(\cdot),\cdot\rangle$. The second stage estimates the descent direction $\Sigma\hat v$ with respect to the (known) inner product $\langle\cdot,\cdot\rangle$, without knowing or estimating $\Sigma$.
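For readers unfamiliar with the power method mentioned above, the following is a minimal, generic sketch of plain power iteration for the top eigenpair of a symmetric positive semidefinite matrix. It is not the estimator proposed in the paper: the function name `power_iteration`, its parameters, and the example matrix are hypothetical and chosen for illustration only.

```python
import math
import random


def power_iteration(matrix, num_iters=200, seed=0):
    """Approximate the top eigenpair of a symmetric PSD matrix.

    Generic textbook power iteration (an illustrative assumption, not the
    paper's algorithm). `matrix` is a square list of lists of floats.
    """
    rng = random.Random(seed)
    dim = len(matrix)

    # Random start vector, normalized to unit Euclidean length.
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    v = [c / norm for c in v]

    for _ in range(num_iters):
        # Multiply by the matrix, then renormalize.
        w = [sum(matrix[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break  # v lies in the null space; nothing more to do
        v = [c / norm for c in w]

    # Rayleigh quotient <v, Mv> estimates the top eigenvalue.
    mv = [sum(matrix[i][j] * v[j] for j in range(dim)) for i in range(dim)]
    eigenvalue = sum(v[i] * mv[i] for i in range(dim))
    return eigenvalue, v


# Hypothetical 2x2 example with eigenvalues 3 and 1.
example_matrix = [[2.0, 1.0], [1.0, 2.0]]
top_value, top_vector = power_iteration(example_matrix)
```

Each iteration costs one matrix-vector product, which is the usual reason power-method-based procedures are regarded as computationally tractable in high dimension.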
