arXiv:2308.07536

Projection-Free Methods for Stochastic Simple Bilevel Optimization with Convex Lower-level Problem

15 August 2023
Jincheng Cao
Ruichen Jiang
Nazanin Abolfazli
E. Y. Hamedani
Aryan Mokhtari
Abstract

In this paper, we study a class of stochastic bilevel optimization problems, also known as stochastic simple bilevel optimization, where we minimize a smooth stochastic objective function over the optimal solution set of another stochastic convex optimization problem. We introduce novel stochastic bilevel optimization methods that locally approximate the solution set of the lower-level problem via a stochastic cutting plane, and then run a conditional gradient update with variance reduction techniques to control the error induced by using stochastic gradients. For the case that the upper-level function is convex, our method requires $\tilde{\mathcal{O}}(\max\{1/\epsilon_f^{2},1/\epsilon_g^{2}\})$ stochastic oracle queries to obtain a solution that is $\epsilon_f$-optimal for the upper-level and $\epsilon_g$-optimal for the lower-level. This guarantee improves the previous best-known complexity of $\mathcal{O}(\max\{1/\epsilon_f^{4},1/\epsilon_g^{4}\})$. Moreover, for the case that the upper-level function is non-convex, our method requires at most $\tilde{\mathcal{O}}(\max\{1/\epsilon_f^{3},1/\epsilon_g^{3}\})$ stochastic oracle queries to find an $(\epsilon_f, \epsilon_g)$-stationary point. In the finite-sum setting, we show that the number of stochastic oracle calls required by our method is $\tilde{\mathcal{O}}(\sqrt{n}/\epsilon)$ and $\tilde{\mathcal{O}}(\sqrt{n}/\epsilon^{2})$ for the convex and non-convex settings, respectively, where $\epsilon=\min\{\epsilon_f,\epsilon_g\}$.
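To make the two ingredients named in the abstract concrete (a stochastic cutting plane for the lower-level solution set, followed by a conditional gradient step with variance-reduced estimators), here is a minimal one-dimensional sketch. It is not the authors' algorithm. The toy objectives, the interval `[LO, HI]`, the cut of the form `dg * (s - x) <= g(x0) - g(x) + slack`, the step size `2 / (t + 2)`, and the momentum-style recursive estimator are all assumptions chosen for illustration, and every function name is hypothetical.

```python
import math
import random

LO, HI = -3.0, 3.0  # hypothetical compact feasible set Z = [LO, HI]

# Toy stochastic oracles (assumed): additive linear noise xi * x on each objective.
def f_grad(x, xi):
    # Upper level f(x) = 0.5 * (x - 2)^2, unconstrained minimizer at x = 2.
    return (x - 2.0) + xi

def g_val(x, xi):
    # Lower level g(x) = 0.5 * max(|x| - 1, 0)^2, solution set [-1, 1].
    return 0.5 * max(abs(x) - 1.0, 0.0) ** 2 + xi * x

def g_grad(x, xi):
    return math.copysign(max(abs(x) - 1.0, 0.0), x) + xi

def lmo_with_cut(c, a, b, x, lo, hi):
    """Return argmin of c * s over lo <= s <= hi subject to a * (s - x) <= b."""
    left, right = lo, hi
    if a > 1e-12:
        right = min(hi, x + b / a)
    elif a < -1e-12:
        left = max(lo, x + b / a)
    # If a is (numerically) zero, the cut is treated as inactive.
    if left > right:
        # A noisy cut excluded all of Z: fall back to the nearest point of Z.
        left = right = min(max(x + b / a, lo), hi)
    return left if c > 0 else right

def cg_bilevel_sketch(T=200, sigma=0.0, slack=0.0, x0=0.0, seed=0):
    rng = random.Random(seed)
    x = x0  # x0 is assumed to be (near-)optimal for the lower level
    df = f_grad(x, rng.gauss(0.0, sigma))  # estimate of f'(x)
    dg = g_grad(x, rng.gauss(0.0, sigma))  # estimate of g'(x)
    vg = g_val(x, rng.gauss(0.0, sigma))   # estimate of g(x)
    vg0 = vg                               # estimate of g(x0)
    for t in range(T):
        gamma = 2.0 / (t + 2)
        # Linear minimization over Z intersected with the cutting plane.
        s = lmo_with_cut(df, dg, vg0 - vg + slack, x, LO, HI)
        x_prev, x = x, x + gamma * (s - x)
        # Recursive momentum update: one fresh sample per estimator,
        # evaluated at both the new and the previous iterate.
        alpha = gamma
        xi_f, xi_g, xi_v = (rng.gauss(0.0, sigma) for _ in range(3))
        df = f_grad(x, xi_f) + (1 - alpha) * (df - f_grad(x_prev, xi_f))
        dg = g_grad(x, xi_g) + (1 - alpha) * (dg - g_grad(x_prev, xi_g))
        vg = g_val(x, xi_v) + (1 - alpha) * (vg - g_val(x_prev, xi_v))
    return x

if __name__ == "__main__":
    print(cg_bilevel_sketch(T=200, sigma=0.0))
    print(cg_bilevel_sketch(T=200, sigma=0.1, slack=0.05, seed=1))
```

With `sigma=0.0` the estimators are exact, and the iterate approaches x = 1, the point of the lower-level solution set [-1, 1] closest to the upper-level minimizer at 2. The `slack` parameter loosens the cut to absorb estimation error in the noisy case; its value here is arbitrary.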
