Near-Optimal Fully First-Order Algorithms for Finding Stationary Points in Bilevel Optimization

Bilevel optimization has various applications such as hyper-parameter optimization and meta-learning. Designing theoretically efficient algorithms for bilevel optimization is more challenging than standard optimization because the lower-level problem defines the feasibility set implicitly via another optimization problem. One tractable case is when the lower-level problem is strongly convex. Recent works show that second-order methods can provably converge to an $\epsilon$-first-order stationary point of the problem at a rate of $\tilde{\mathcal{O}}(\epsilon^{-2})$, yet these algorithms require a Hessian-vector product oracle. Kwon et al. (2023) resolved the problem by proposing a first-order method that can achieve the same goal at a slower rate of $\tilde{\mathcal{O}}(\epsilon^{-3})$. In this work, we provide an improved analysis demonstrating that the first-order method can also find an $\epsilon$-first-order stationary point within $\tilde{\mathcal{O}}(\epsilon^{-2})$ oracle complexity, which matches the upper bounds for second-order methods in the dependency on $\epsilon$. Our analysis further leads to simple first-order algorithms that can achieve similar near-optimal rates in finding second-order stationary points and in distributed bilevel problems.
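To make the idea of a fully first-order method concrete, the following is a minimal sketch on a toy scalar problem. It assumes the penalty-style hypergradient estimate commonly associated with this line of work, namely grad_x f(x, y_lam) + lam * (grad_x g(x, y_lam) - grad_x g(x, y*)), where y_lam approximately minimizes f + lam * g and y* approximately minimizes g. The toy objectives, the function names, the penalty value, and the step sizes are all illustrative assumptions and are not taken from the paper.

```python
# Toy bilevel problem (hypothetical, chosen so the answer is known in closed form):
#   upper level: f(x, y) = 0.5*(y - 1)**2 + 0.5*x**2
#   lower level: g(x, y) = 0.5*(y - x)**2, strongly convex in y, so y*(x) = x
# Hence phi(x) = f(x, y*(x)) has gradient 2*x - 1 and minimizer x = 0.5.

def grad_f(x, y):
    """Return (df/dx, df/dy) for the toy upper-level objective."""
    return x, y - 1.0

def grad_g(x, y):
    """Return (dg/dx, dg/dy) for the toy lower-level objective."""
    return x - y, y - x

def first_order_hypergrad(x, lam=1e3, inner_steps=50):
    """Estimate d(phi)/dx using gradients only (no Hessian-vector products)."""
    y = 0.0  # will approximate argmin_y f(x, y) + lam * g(x, y)
    z = 0.0  # will approximate argmin_z g(x, z), i.e. y*(x)
    for _ in range(inner_steps):
        _, fy = grad_f(x, y)
        _, gy = grad_g(x, y)
        # Step size 1/(1 + lam) matches the curvature of f + lam*g in y here.
        y -= (fy + lam * gy) / (1.0 + lam)
        _, gz = grad_g(x, z)
        # Step size 1 matches the curvature of g in y here.
        z -= gz
    fx, _ = grad_f(x, y)
    gx_at_y, _ = grad_g(x, y)
    gx_at_z, _ = grad_g(x, z)
    return fx + lam * (gx_at_y - gx_at_z)

def solve(x0=0.0, lr=0.1, outer_steps=200, lam=1e3):
    """Plain gradient descent on x using the first-order estimate above."""
    x = x0
    for _ in range(outer_steps):
        x -= lr * first_order_hypergrad(x, lam=lam)
    return x

if __name__ == "__main__":
    print("estimate at x=0:", first_order_hypergrad(0.0), "exact:", -1.0)
    print("approximate minimizer:", solve(), "exact:", 0.5)
```

On this toy problem the estimate differs from the exact gradient by a term of order 1/lam, which illustrates why the penalty parameter has to grow as the target accuracy tightens. How that parameter and the inner-loop budget are scheduled is exactly what the complexity analysis governs, and the sketch does not attempt to reproduce it.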
@article{chen2025_2306.14853,
  title={Near-Optimal Nonconvex-Strongly-Convex Bilevel Optimization with Fully First-Order Oracles},
  author={Lesi Chen and Yaohua Ma and Jingzhao Zhang},
  journal={arXiv preprint arXiv:2306.14853},
  year={2025}
}