
Faster Gradient-Free Algorithms for Nonsmooth Nonconvex Stochastic Optimization

Abstract

We consider optimization problems of the form $\min_{x \in \mathbb{R}^d} f(x) \triangleq \mathbb{E}_{\xi}[F(x;\xi)]$, where the component $F(x;\xi)$ is $L$-mean-squared Lipschitz but possibly nonconvex and nonsmooth. The recently proposed gradient-free method requires at most $\mathcal{O}(L^4 d^{3/2} \epsilon^{-4} + \Delta L^3 d^{3/2} \delta^{-1} \epsilon^{-4})$ stochastic zeroth-order oracle calls to find a $(\delta,\epsilon)$-Goldstein stationary point of the objective function, where $\Delta = f(x_0) - \inf_{x \in \mathbb{R}^d} f(x)$ and $x_0$ is the initial point of the algorithm. This paper proposes a more efficient algorithm using stochastic recursive gradient estimators, which improves the complexity to $\mathcal{O}(L^3 d^{3/2} \epsilon^{-3} + \Delta L^2 d^{3/2} \delta^{-1} \epsilon^{-3})$.
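For intuition, here is a minimal Python sketch of two standard ingredients behind such methods: a two-point zeroth-order estimator of the gradient of the uniformly smoothed surrogate $f_\delta(x) = \mathbb{E}_{u}[f(x + \delta u)]$ (with $u$ uniform on the unit ball), and a SARAH/SPIDER-style recursive correction that reuses the same sample and random direction at consecutive iterates. The function names, batching, and step structure are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def zo_grad(F, x, delta, xi, w):
    """Two-point zeroth-order gradient estimate of the smoothed surrogate
    f_delta(x) = E_{u ~ Ball}[f(x + delta * u)].
    F(x, xi) is one stochastic function evaluation; w must be uniform on
    the unit sphere. (Standard randomized-smoothing estimator; the paper's
    exact construction may differ.)"""
    d = x.shape[0]
    return (d / (2.0 * delta)) * (F(x + delta * w, xi) - F(x - delta * w, xi)) * w

def recursive_zo_step(F, x_prev, x, v_prev, delta, batch, rng):
    """One SARAH/SPIDER-style recursive update (illustrative):
        v_t = v_{t-1} + mean over xi of [ g(x_t; xi) - g(x_{t-1}; xi) ],
    evaluating both iterates with the SAME sample xi and direction w, so the
    correction has small variance when x_t is close to x_{t-1}."""
    d = x.shape[0]
    corr = np.zeros(d)
    for xi in batch:
        w = rng.standard_normal(d)
        w /= np.linalg.norm(w)  # uniform direction on the unit sphere
        corr += zo_grad(F, x, delta, xi, w) - zo_grad(F, x_prev, delta, xi, w)
    return v_prev + corr / len(batch)

if __name__ == "__main__":
    # Toy nonsmooth stochastic objective (hypothetical, for demonstration only).
    rng = np.random.default_rng(0)
    F = lambda x, xi: np.abs(x).sum() + 0.01 * xi
    x0 = np.ones(10)
    w = rng.standard_normal(10)
    w /= np.linalg.norm(w)
    print(zo_grad(F, x0, 0.05, rng.standard_normal(), w))
```

Roughly speaking, coupling the same randomness across consecutive iterates is what variance-reduced recursive estimators exploit, and it is this mechanism that enables the improved $\epsilon$-dependence reported in the abstract.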
