
Least-squares regressions via randomized Hessians

Abstract

We consider the least-squares regression problem with a finite number of points. We analyze a novel approach, based on randomizing the Hessian matrix, to approximately solve this problem. The new algorithm is a variant of the averaged stochastic gradient descent method (SGD) with constant step-size. However, its updating rule relies on the entire response vector, and its convergence properties do not depend on the residuals. Without strong convexity assumptions, it is proven that the algorithm achieves a convergence rate for function values of $O(1/k)$ after $k$ iterations, where the constant behind the $O$ notation does not depend explicitly on the smallest eigenvalue of the Hessian matrix. The algorithm has a preprocessing cost proportional to the input size, and the running time of each iteration is proportional to the dimension. In the strongly-convex case, a restart version of the algorithm yields a convergence rate of $O(k^{-l})$ in $O(ld(n+k))$ time for arbitrary $l \ge 2$, where the constant behind the $O$ notation depends on $l$ and on the smallest eigenvalue of the Hessian matrix. Our theoretical results are illustrated with numerical experiments.
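To make the description concrete, below is a minimal NumPy sketch of one plausible reading of the abstract: averaged constant-step SGD in which only the Hessian term is randomized (one sampled row per iteration) while the linear term $A^\top b/n$ is precomputed once from the entire response vector, together with a restart wrapper. The function names, the uniform row sampling, the warm-start restart scheme, and the step-size heuristic in the usage example are all assumptions made for illustration; this is not the authors' reference implementation.

import numpy as np

def averaged_rh_sgd(A, b, gamma, n_iters, x0=None, rng=None):
    """Averaged constant-step SGD with a randomized Hessian term (sketch).

    Assumed update: x <- x - gamma * (a_i <a_i, x> - A^T b / n), which
    randomizes the Hessian (one row outer product per step) but keeps the
    exact linear term, so the gradient noise does not involve the residuals.
    """
    rng = np.random.default_rng(rng)
    n, d = A.shape
    c = A.T @ b / n                     # O(nd) preprocessing; uses the whole response vector
    x = np.zeros(d) if x0 is None else np.asarray(x0, dtype=float).copy()
    x_bar = x.copy()
    for k in range(1, n_iters + 1):
        a_i = A[rng.integers(n)]        # sample one data point uniformly
        x = x - gamma * (a_i * (a_i @ x) - c)   # O(d) per iteration
        x_bar += (x - x_bar) / (k + 1)  # running (Polyak-Ruppert) average
    return x_bar

def restarted_rh_sgd(A, b, gamma, n_iters, l=2, rng=None):
    """Hypothetical restart scheme: rerun the averaged method l times,
    warm-starting each stage at the previous stage's averaged iterate."""
    rng = np.random.default_rng(rng)
    x = None
    for _ in range(l):
        x = averaged_rh_sgd(A, b, gamma, n_iters, x0=x, rng=rng)
    return x

# Tiny usage example on synthetic data (illustrative parameters only).
rng = np.random.default_rng(0)
A = rng.standard_normal((1000, 20))
b = A @ rng.standard_normal(20) + 0.1 * rng.standard_normal(1000)
gamma = 1.0 / (2 * np.max(np.sum(A**2, axis=1)))   # common safe step-size heuristic
x_hat = restarted_rh_sgd(A, b, gamma, n_iters=5000, l=3, rng=rng)

Total cost matches the abstract's accounting under these assumptions: the $O(nd)$ precomputation of $A^\top b/n$ plus $l$ stages of $k$ iterations at $O(d)$ each gives $O(ld(n+k))$ time.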
