A spectral least-squares-type method for heavy-tailed corrupted regression with unknown covariance & heterogeneous noise

We revisit heavy-tailed corrupted least-squares linear regression, assuming access to a corrupted label-feature sample that contains a bounded number of arbitrary outliers. We wish to estimate a finite-dimensional parameter from such a sample of a label-feature pair that satisfies a linear model with heavy-tailed features and noise. We only assume that the feature vector is hypercontractive and has a covariance matrix with positive minimum eigenvalue and bounded condition number. The noise can depend arbitrarily on the features and can be nonsymmetric, as long as a suitable covariance matrix involving the noise is finite. We propose a near-optimal, computationally tractable estimator, based on the power method, that requires no knowledge of the underlying covariance matrices or of the relevant operator norm. With high probability, the proposed estimator attains a statistical rate and a breakdown point that are both optimal in the norm considered, under a minimum sample size that is near-optimal up to a log factor. To the best of our knowledge, this is the first computationally tractable algorithm that satisfies all of these properties simultaneously. Our estimator is based on a two-stage Multiplicative Weight Update algorithm. The first stage estimates a descent direction with respect to the (unknown) pre-conditioned inner product. The second stage estimates the descent direction with respect to the (known) inner product, without knowing or estimating the covariance matrix.
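For readers unfamiliar with the power method mentioned in the abstract, the following is a minimal, generic sketch of power iteration for the top eigenpair of a symmetric positive semidefinite matrix. It is not the estimator proposed in the paper: the function name `power_iteration`, its parameters, and the example matrix are all assumptions made purely for illustration, and the outlier-robust weighting of the Multiplicative Weight Update stages is not modelled here.

```python
import math
import random


def power_iteration(matrix, num_iters=200, seed=0):
    """Estimate the top eigenvalue and eigenvector of a symmetric PSD matrix.

    `matrix` is a square list of lists. Illustrative sketch only; this is
    generic power iteration, not the paper's estimator.
    """
    rng = random.Random(seed)
    dim = len(matrix)

    # A random Gaussian start is almost surely not orthogonal to the
    # top eigenvector, which power iteration needs in order to converge.
    vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(v * v for v in vec))
    vec = [v / norm for v in vec]

    for _ in range(num_iters):
        # Apply the matrix, then renormalise to unit Euclidean length.
        product = [sum(row[j] * vec[j] for j in range(dim)) for row in matrix]
        norm = math.sqrt(sum(p * p for p in product))
        if norm == 0.0:
            break  # the current vector lies in the null space
        vec = [p / norm for p in product]

    # The Rayleigh quotient of the unit vector gives the eigenvalue estimate.
    product = [sum(row[j] * vec[j] for j in range(dim)) for row in matrix]
    eigenvalue = sum(vec[i] * product[i] for i in range(dim))
    return eigenvalue, vec


if __name__ == "__main__":
    # Hypothetical 2x2 example whose eigenvalues are 3 and 1.
    value, vector = power_iteration([[2.0, 1.0], [1.0, 2.0]])
    print(value, vector)
```

Each iteration costs one matrix-vector product, which is one reason spectral steps of this kind are attractive when computational tractability matters. The sign of the returned eigenvector is arbitrary.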