Near-Optimal Fully First-Order Algorithms for Finding Stationary Points in Bilevel Optimization

26 June 2023

Lesi Chen

Yaohua Ma

Jingzhao Zhang


Main:48 Pages

3 Figures

Bibliography:8 Pages

3 Tables

Abstract

Bilevel optimization has various applications such as hyper-parameter optimization and meta-learning. Designing theoretically efficient algorithms for bilevel optimization is more challenging than standard optimization because the lower-level problem defines the feasibility set implicitly via another optimization problem. One tractable case is when the lower-level problem permits strong convexity. Recent works show that second-order methods can provably converge to an $\epsilon$-first-order stationary point of the problem at a rate of $\tilde{\mathcal{O}}(\epsilon^{-2})$, yet these algorithms require a Hessian-vector product oracle. Kwon et al. (2023) resolved the problem by proposing a first-order method that can achieve the same goal at a slower rate of $\tilde{\mathcal{O}}(\epsilon^{-3})$. In this work, we provide an improved analysis demonstrating that the first-order method can also find an $\epsilon$-first-order stationary point within $\tilde{\mathcal{O}}(\epsilon^{-2})$ oracle complexity, which matches the upper bounds for second-order methods in the dependency on $\epsilon$. Our analysis further leads to simple first-order algorithms that can achieve similar near-optimal rates in finding second-order stationary points and in distributed bilevel problems.
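To make the "fully first-order" idea concrete, the following is a minimal sketch on a toy one-dimensional problem, not the algorithm or analysis from the paper. It assumes the penalty-style reformulation commonly associated with Kwon et al. (2023), in which the hypergradient is approximated by $\nabla_x f(x, y) + \lambda\,(\nabla_x g(x, y) - \nabla_x g(x, z))$, where $y$ approximately minimizes $f(x,\cdot) + \lambda g(x,\cdot)$ and $z$ approximately minimizes $g(x,\cdot)$. The toy objectives, the function name `first_order_bilevel`, the penalty value `lam`, and all step sizes are illustrative assumptions.

```python
def grad_f(x, y):
    # Toy upper-level objective (assumed): f(x, y) = 0.5*(y - 1)^2 + 0.5*x^2
    # Returns (df/dx, df/dy).
    return x, y - 1.0


def grad_g(x, y):
    # Toy lower-level objective (assumed): g(x, y) = 0.5*(y - x)^2,
    # which is strongly convex in y with minimizer y*(x) = x.
    # Returns (dg/dx, dg/dy).
    return x - y, y - x


def first_order_bilevel(x0=0.0, lam=100.0, outer_steps=200,
                        inner_steps=50, lr_x=0.1):
    """Penalty-based bilevel sketch that uses gradients only (no Hessians)."""
    x, y, z = x0, 0.0, 0.0
    for _ in range(outer_steps):
        for _ in range(inner_steps):
            # y tracks argmin_y f(x, y) + lam * g(x, y)
            _, fy = grad_f(x, y)
            _, gy = grad_g(x, y)
            y -= 0.5 / (1.0 + lam) * (fy + lam * gy)
            # z tracks argmin_z g(x, z)
            _, gz = grad_g(x, z)
            z -= 0.5 * gz
        # First-order surrogate for the hypergradient of phi(x) = f(x, y*(x))
        fx, _ = grad_f(x, y)
        gx_y, _ = grad_g(x, y)
        gx_z, _ = grad_g(x, z)
        x -= lr_x * (fx + lam * (gx_y - gx_z))
    return x, y, z


if __name__ == "__main__":
    x, y, z = first_order_bilevel()
    print(x, y, z)
```

For this toy problem the true hyper-objective is $\varphi(x) = 0.5(x-1)^2 + 0.5x^2$, whose stationary point is $x = 0.5$. With a finite penalty the sketch settles at $\lambda/(1+2\lambda)$, which approaches $0.5$ as $\lambda$ grows, illustrating the bias that a finite penalty introduces.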


@article{chen2025_2306.14853,
  title={Near-Optimal Nonconvex-Strongly-Convex Bilevel Optimization with Fully First-Order Oracles},
  author={Lesi Chen and Yaohua Ma and Jingzhao Zhang},
  journal={arXiv preprint arXiv:2306.14853},
  year={2025}
}
