Escaping Saddle-Points Faster under Interpolation-like Conditions

28 September 2020
Abhishek Roy
Krishnakumar Balasubramanian
Saeed Ghadimi
P. Mohapatra
Abstract

In this paper, we show that under over-parametrization several standard stochastic optimization algorithms escape saddle-points and converge to local-minimizers much faster. One of the fundamental aspects of over-parametrized models is that they are capable of interpolating the training data. We show that, under interpolation-like assumptions satisfied by the stochastic gradients in the over-parametrization setting, the first-order oracle complexity of the Perturbed Stochastic Gradient Descent (PSGD) algorithm to reach an $\epsilon$-local-minimizer matches the corresponding deterministic rate of $\tilde{\mathcal{O}}(1/\epsilon^{2})$. We next analyze the Stochastic Cubic-Regularized Newton (SCRN) algorithm under interpolation-like conditions and show that the corresponding oracle complexity to reach an $\epsilon$-local-minimizer is $\tilde{\mathcal{O}}(1/\epsilon^{2.5})$. While this complexity is better than that of either PSGD or SCRN without interpolation-like assumptions, it does not match the rate of $\tilde{\mathcal{O}}(1/\epsilon^{1.5})$ of the deterministic Cubic-Regularized Newton method; further Hessian-based interpolation-like assumptions appear necessary to bridge this gap. We also discuss the corresponding improved complexities in the zeroth-order settings.
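
To make the first-order setting concrete, below is a minimal Python sketch of a perturbed stochastic gradient loop in the spirit of PSGD: take stochastic gradient steps and, whenever the observed gradient is small and no perturbation was injected recently, add noise drawn from a small ball to help escape a saddle. The step size, perturbation radius, thresholds, and the toy objective (a saddle at the origin with minimizers at $(\pm 1, 0)$) are illustrative assumptions, not the schedule or setting analyzed in the paper.

import numpy as np

def psgd(grad_oracle, x0, step_size=1e-2, noise_radius=1e-2,
         grad_threshold=1e-3, perturb_interval=50, n_iters=10_000, rng=None):
    # Perturbed SGD sketch: plain SGD steps plus occasional ball-shaped noise
    # injected when the stochastic gradient looks small (a possible saddle).
    rng = np.random.default_rng() if rng is None else rng
    x = np.array(x0, dtype=float)
    last_perturb = -perturb_interval
    for t in range(n_iters):
        g = grad_oracle(x)  # stochastic gradient estimate
        if np.linalg.norm(g) <= grad_threshold and t - last_perturb >= perturb_interval:
            u = rng.normal(size=x.shape)
            u *= noise_radius * rng.uniform() ** (1.0 / x.size) / np.linalg.norm(u)
            x = x + u       # uniform draw from a ball of radius noise_radius
            last_perturb = t
        x = x - step_size * g
    return x

# Toy objective F(x) = x1^4/4 - x1^2/2 + x2^2/2: saddle at the origin,
# local minimizers at (+1, 0) and (-1, 0). Multiplicative noise loosely mimics
# an interpolation-like oracle whose noise vanishes where the gradient vanishes.
rng = np.random.default_rng(0)
def grad_oracle(x):
    g = np.array([x[0] ** 3 - x[0], x[1]])
    return g * (1.0 + 0.1 * rng.normal(size=2))

print(psgd(grad_oracle, x0=np.zeros(2), rng=rng))  # typically ends near (+1, 0) or (-1, 0)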
