arXiv:2404.06720

Gradient Descent is Pareto-Optimal in the Oracle Complexity and Memory Tradeoff for Feasibility Problems

10 April 2024
Moise Blanchard
Abstract

In this paper we provide oracle complexity lower bounds for finding a point in a given set using a memory-constrained algorithm that has access to a separation oracle. We assume that the set is contained within the unit $d$-dimensional ball and contains a ball of known radius $\epsilon>0$. This setup is commonly referred to as the feasibility problem. We show that to solve feasibility problems with accuracy $\epsilon \geq e^{-d^{o(1)}}$, any deterministic algorithm either uses $d^{1+\delta}$ bits of memory or must make at least $1/(d^{0.01\delta}\epsilon^{2\frac{1-\delta}{1+1.01\delta}-o(1)})$ oracle queries, for any $\delta\in[0,1]$. Additionally, we show that randomized algorithms either use $d^{1+\delta}$ memory or make at least $1/(d^{2\delta}\epsilon^{2(1-4\delta)-o(1)})$ queries for any $\delta\in[0,\frac{1}{4}]$. Because gradient descent only uses linear memory $\mathcal{O}(d\ln 1/\epsilon)$ but makes $\Omega(1/\epsilon^2)$ queries, our results imply that it is Pareto-optimal in the oracle complexity/memory tradeoff. Further, our results show that the oracle complexity for deterministic algorithms is always polynomial in $1/\epsilon$ if the algorithm has less than quadratic memory in $d$. This reveals a sharp phase transition since with quadratic $\mathcal{O}(d^2\ln 1/\epsilon)$ memory, cutting plane methods only require $\mathcal{O}(d\ln 1/\epsilon)$ queries.
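
To make the tradeoff concrete, the following minimal Python sketch evaluates the exponent of $1/\epsilon$ in the two stated lower bounds as a function of the memory parameter $\delta$. It is an illustration only: the function names are hypothetical, the $o(1)$ terms and all hidden constants are dropped, and the resulting numbers are not the paper's exact statements. They are meaningful at most as a rough guide in the regime $\epsilon \geq e^{-d^{o(1)}}$.

```python
def deterministic_eps_exponent(delta: float) -> float:
    """Exponent a such that the deterministic bound scales as (1/eps)**a.

    Taken from 1 / (d^{0.01*delta} * eps^{2(1-delta)/(1+1.01*delta) - o(1)}),
    with the o(1) term dropped (an assumption made for illustration).
    """
    if not 0.0 <= delta <= 1.0:
        raise ValueError("deterministic bound is stated for delta in [0, 1]")
    return 2.0 * (1.0 - delta) / (1.0 + 1.01 * delta)


def randomized_eps_exponent(delta: float) -> float:
    """Exponent a such that the randomized bound scales as (1/eps)**a.

    Taken from 1 / (d^{2*delta} * eps^{2(1-4*delta) - o(1)}),
    with the o(1) term dropped (an assumption made for illustration).
    """
    if not 0.0 <= delta <= 0.25:
        raise ValueError("randomized bound is stated for delta in [0, 1/4]")
    return 2.0 * (1.0 - 4.0 * delta)


def deterministic_query_bound(d: int, eps: float, delta: float) -> float:
    """Illustrative value of the deterministic bound, ignoring o(1) and constants."""
    return d ** (-0.01 * delta) * (1.0 / eps) ** deterministic_eps_exponent(delta)


if __name__ == "__main__":
    # Memory budget is d^(1+delta) bits; print how the 1/eps exponent decays.
    for delta in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"delta={delta:.2f}  deterministic exponent="
              f"{deterministic_eps_exponent(delta):.3f}")
    for delta in (0.0, 0.1, 0.2, 0.25):
        print(f"delta={delta:.2f}  randomized exponent="
              f"{randomized_eps_exponent(delta):.3f}")
```

At $\delta = 0$ (linear memory) both exponents equal 2, matching the $1/\epsilon^2$ query count of gradient descent. The deterministic exponent stays positive for every $\delta < 1$ and reaches 0 only at $\delta = 1$ (quadratic memory), which is the phase transition described in the abstract.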
