
Least-squares regressions via randomized Hessians

Abstract

We consider the least-squares regression problem with a finite number of points. We analyze a novel approach, based on randomizing the Hessian matrix, to approximately solve this problem. The new algorithm is a variant of the averaged stochastic gradient descent method (SGD) with constant step-size. However, its updating rule relies on the entire response vector, and its convergence properties do not depend on the residuals. Without strong convexity assumptions, it is proven that the algorithm achieves a convergence rate for function values of $O(1/k)$ after $k$ iterations, where the constant behind the $O$ notation does not depend explicitly on the smallest eigenvalue of the Hessian matrix. The algorithm has a preprocessing cost proportional to the input size, and the running time of each iteration is proportional to the dimension. In the strongly-convex case, a restart version of the algorithm yields a convergence rate of $O(k^{-l})$ in $O(ld(n+k))$ time for arbitrary $l \ge 2$, where the constant behind the $O$ notation depends on $l$ and on the smallest eigenvalue of the Hessian matrix. Our theoretical results are illustrated with numerical experiments.
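To make the description concrete, below is a minimal NumPy sketch of one plausible reading of the abstract: averaged constant-step SGD in which only the Hessian term is randomized (one sampled row per iteration) while the linear term $A^\top b/n$ is precomputed once from the entire response vector, together with a restart wrapper. The function names, the uniform row sampling, the warm-start restart scheme, and the step-size heuristic in the usage example are all assumptions made for illustration; this is not the authors' reference implementation.

import numpy as np

def averaged_rh_sgd(A, b, gamma, n_iters, x0=None, rng=None):
    """Averaged constant-step SGD with a randomized Hessian term (sketch).

    Assumed update: x <- x - gamma * (a_i <a_i, x> - A^T b / n), which
    randomizes the Hessian (one row outer product per step) but keeps the
    exact linear term, so the gradient noise does not involve the residuals.
    """
    rng = np.random.default_rng(rng)
    n, d = A.shape
    c = A.T @ b / n                     # O(nd) preprocessing; uses the whole response vector
    x = np.zeros(d) if x0 is None else np.asarray(x0, dtype=float).copy()
    x_bar = x.copy()
    for k in range(1, n_iters + 1):
        a_i = A[rng.integers(n)]        # sample one data point uniformly
        x = x - gamma * (a_i * (a_i @ x) - c)   # O(d) per iteration
        x_bar += (x - x_bar) / (k + 1)  # running (Polyak-Ruppert) average
    return x_bar

def restarted_rh_sgd(A, b, gamma, n_iters, l=2, rng=None):
    """Hypothetical restart scheme: rerun the averaged method l times,
    warm-starting each stage at the previous stage's averaged iterate."""
    rng = np.random.default_rng(rng)
    x = None
    for _ in range(l):
        x = averaged_rh_sgd(A, b, gamma, n_iters, x0=x, rng=rng)
    return x

# Tiny usage example on synthetic data (illustrative parameters only).
rng = np.random.default_rng(0)
A = rng.standard_normal((1000, 20))
b = A @ rng.standard_normal(20) + 0.1 * rng.standard_normal(1000)
gamma = 1.0 / (2 * np.max(np.sum(A**2, axis=1)))   # common safe step-size heuristic
x_hat = restarted_rh_sgd(A, b, gamma, n_iters=5000, l=3, rng=rng)

Total cost matches the abstract's accounting under these assumptions: the $O(nd)$ precomputation of $A^\top b/n$ plus $l$ stages of $k$ iterations at $O(d)$ each gives $O(ld(n+k))$ time.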
