A Fully First-Order Method for Stochastic Bilevel Optimization

Abstract

We consider stochastic unconstrained bilevel optimization problems when only the first-order gradient oracles are available. While numerous optimization methods have been proposed for tackling bilevel problems, existing methods either tend to require possibly expensive calculations regarding Hessians of lower-level objectives, or lack rigorous finite-time performance guarantees. In this work, we propose a Fully First-order Stochastic Approximation (F2SA) method, and study its non-asymptotic convergence properties. Specifically, we show that F2SA converges to an $\epsilon$-stationary solution of the bilevel problem after $\epsilon^{-7/2}$, $\epsilon^{-5/2}$, and $\epsilon^{-3/2}$ iterations (each iteration using $O(1)$ samples) when stochastic noises are in both level objectives, only in the upper-level objective, and not present (deterministic settings), respectively. We further show that if we employ momentum-assisted gradient estimators, the iteration complexities can be improved to $\epsilon^{-5/2}$, $\epsilon^{-4/2}$, and $\epsilon^{-3/2}$, respectively. We demonstrate even superior practical performance of the proposed method over existing second-order based approaches on MNIST data-hypercleaning experiments.
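To give a feel for how the quoted rates compare, the following minimal sketch evaluates $\epsilon^{-p}$ for each exponent stated in the abstract. It is only an order-of-magnitude illustration: the hidden constants in the bounds are ignored, and the names `RATES`, `iteration_order`, and `compare` are hypothetical helpers introduced here, not part of the paper.

```python
# Exponents p of the eps^{-p} iteration bounds quoted in the abstract.
# Constants hidden by the bounds are ignored, so values are orders only.
RATES = {
    "noise in both levels": {"plain": 7 / 2, "momentum": 5 / 2},
    "noise in upper level only": {"plain": 5 / 2, "momentum": 4 / 2},
    "deterministic": {"plain": 3 / 2, "momentum": 3 / 2},
}


def iteration_order(eps: float, exponent: float) -> float:
    """Return eps ** (-exponent), the iteration order up to constants."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    return eps ** (-exponent)


def compare(eps: float) -> dict:
    """Evaluate every quoted rate at the target accuracy eps."""
    return {
        setting: {name: iteration_order(eps, p) for name, p in variants.items()}
        for setting, variants in RATES.items()
    }


if __name__ == "__main__":
    # Print the orders for one illustrative accuracy level.
    for setting, orders in compare(0.01).items():
        print(f"{setting}: plain ~ {orders['plain']:.3g}, "
              f"momentum ~ {orders['momentum']:.3g}")
```

Under these stated rates, momentum reduces the order by a factor of $\epsilon^{-1}$ when noise is present in both levels, by $\epsilon^{-1/2}$ when noise is only in the upper level, and leaves the deterministic rate unchanged.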
