Convergence Rates for Stochastic Approximation: Biased Noise with Unbounded Variance, and Applications

The Stochastic Approximation (SA) algorithm introduced by Robbins and Monro in 1951 has been a standard method for solving equations of the form $f(\theta) = 0$, when only noisy measurements of $f(\cdot)$ are available. If $f(\theta) = \nabla J(\theta)$ for some function $J(\cdot)$, then SA can also be used to find a stationary point of $J(\cdot)$. At each time $t$, the current guess $\theta_t$ is updated to $\theta_{t+1}$ using a noisy measurement of the form $f(\theta_t) + \xi_{t+1}$. In much of the literature, it is assumed that the error term $\xi_{t+1}$ has zero conditional mean, and/or that its conditional variance is bounded as a function of $t$ (though not necessarily with respect to $\theta_t$). Over the years, SA has been applied to a variety of areas; the focus in this paper is on convex and nonconvex optimization. As it turns out, in these applications the above-mentioned assumptions on the measurement error do not always hold: in zero-order methods, the error has neither zero conditional mean nor bounded conditional variance. In the present paper, we extend SA theory to encompass errors with nonzero conditional mean and/or unbounded conditional variance. In addition, we derive estimates of the rate of convergence of the algorithm, and compute the ``optimal step size sequences'' that maximize the estimated rate of convergence.
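
As a rough illustration of the setting described in the abstract (not the paper's own construction), the sketch below runs the Robbins--Monro recursion on a toy quadratic objective using a two-measurement simultaneous-perturbation (zero-order) gradient estimate built from noisy function values. The objective $J$, the noise level, and the step-size and perturbation-size exponents are all illustrative assumptions; the paper's optimal step-size sequences are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

def J(theta):
    # Toy smooth objective with minimizer at 0; f(theta) = grad J(theta) = theta.
    return 0.5 * float(theta @ theta)

def noisy_J(theta, sigma=0.1):
    # Zero-order setting: only noisy function values are available.
    return J(theta) + sigma * rng.standard_normal()

def spsa_grad(theta, t):
    # Two-measurement simultaneous-perturbation estimate of grad J(theta).
    # For a general smooth J its conditional bias is O(c_t^2), and the
    # measurement noise contributes a conditional variance term of order
    # 1 / c_t^2, which is unbounded as c_t -> 0.
    c_t = 1.0 / (t + 1) ** 0.25
    delta = rng.choice([-1.0, 1.0], size=theta.size)  # Rademacher perturbation
    y_plus = noisy_J(theta + c_t * delta)
    y_minus = noisy_J(theta - c_t * delta)
    return (y_plus - y_minus) / (2.0 * c_t) * delta

def stochastic_approximation(grad_est, theta0, num_steps, alpha):
    # Robbins-Monro recursion theta_{t+1} = theta_t - alpha_t * (f(theta_t) + xi_{t+1}),
    # written for the minimization case f = grad J.
    theta = np.array(theta0, dtype=float)
    for t in range(num_steps):
        theta = theta - alpha(t) * grad_est(theta, t)
    return theta

# Illustrative (not optimized) step-size sequence alpha_t = (t+1)^{-0.75}.
theta_hat = stochastic_approximation(spsa_grad, theta0=np.ones(5),
                                     num_steps=20000,
                                     alpha=lambda t: (t + 1) ** -0.75)
print("theta_hat =", theta_hat, " J(theta_hat) =", J(theta_hat))
```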