On the Condition Number Dependency in Bilevel Optimization

Main: 18 pages. Bibliography: 4 pages. Appendix: 9 pages. 2 figures, 3 tables.
Abstract

Bilevel optimization minimizes an objective function defined by an upper-level problem whose feasible region is the solution set of a lower-level problem. We study the oracle complexity of finding an $\epsilon$-stationary point with first-order methods when the upper-level problem is nonconvex and the lower-level problem is strongly convex. Recent works (Ji et al., ICML 2021; Arbel and Mairal, ICLR 2022; Chen et al., JMLR 2025) achieve a $\tilde{\mathcal{O}}(\kappa^4 \epsilon^{-2})$ upper bound that is near-optimal in $\epsilon$. However, the optimal dependency on the condition number $\kappa$ is unknown. In this work, we establish a new $\Omega(\kappa^2 \epsilon^{-2})$ lower bound and a $\tilde{\mathcal{O}}(\kappa^{7/2} \epsilon^{-2})$ upper bound for this problem, which gives the first provable gap between bilevel problems and minimax problems in this setup. Our lower bounds can be extended to various settings, including high-order smooth functions, stochastic oracles, and convex hyper-objectives: (1) For second-order and arbitrarily smooth problems, we show $\Omega(\kappa_y^{13/4} \epsilon^{-12/7})$ and $\Omega(\kappa^{17/10} \epsilon^{-8/5})$ lower bounds, respectively. (2) For convex-strongly-convex problems, we improve the previously best lower bound (Ji and Liang, JMLR 2022) from $\Omega(\kappa / \sqrt{\epsilon})$ to $\Omega(\kappa^{5/4} / \sqrt{\epsilon})$. (3) For smooth stochastic problems, we show an $\Omega(\kappa^4 \epsilon^{-4})$ lower bound.
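
For orientation, the setting described in the abstract can be sketched in the standard notation for nonconvex-strongly-convex bilevel problems. The symbols $f$, $g$, $\varphi$, $y^*(x)$, $L$, and $\mu$ are assumptions introduced here for illustration and are not taken from the abstract; the paper's own definitions may differ in detail. The last line only restates the gap between the abstract's upper and lower bounds.

```latex
% Assumed standard formulation (notation is illustrative, not from the abstract).
\begin{align*}
  &\min_{x} \; \varphi(x) := f\bigl(x, y^*(x)\bigr),
  \qquad y^*(x) = \arg\min_{y} \, g(x, y), \\
  % g is assumed mu-strongly convex in y and L-smooth; kappa is their ratio.
  &\kappa := L / \mu, \\
  % An epsilon-stationary point of the (possibly nonconvex) hyper-objective.
  &\|\nabla \varphi(x)\| \le \epsilon, \\
  % Remaining gap between the stated upper and lower bounds, up to log factors.
  &\frac{\kappa^{7/2}\,\epsilon^{-2}}{\kappa^{2}\,\epsilon^{-2}} = \kappa^{3/2}.
\end{align*}
```

Under these assumptions, the stated bounds match in their $\epsilon^{-2}$ dependence and differ only by a factor of $\kappa^{3/2}$ (ignoring logarithmic terms).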
