Exponential Convergence Time of Gradient Descent for One-Dimensional Deep Linear Neural Networks

23 September 2018
Ohad Shamir
arXiv:1809.08587
Abstract

We study the dynamics of gradient descent on objective functions of the form $f(\prod_{i=1}^{k} w_i)$ (with respect to scalar parameters $w_1,\ldots,w_k$), which arise in the context of training depth-$k$ linear neural networks. We prove that for standard random initializations, and under mild assumptions on $f$, the number of iterations required for convergence scales exponentially with the depth $k$. We also show empirically that this phenomenon can occur in higher dimensions, where each $w_i$ is a matrix. This highlights a potential obstacle in understanding the convergence of gradient-based methods for deep linear neural networks, where $k$ is large.
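
The following is a minimal numerical sketch of the one-dimensional setting, not code from the paper: it runs plain gradient descent on $f(\prod_{i=1}^{k} w_i)$ with the hypothetical choice $f(x) = (x-1)^2$, a standard Gaussian initialization as a stand-in for the "standard random initializations" above, and an arbitrary fixed step size and tolerance, then reports how many iterations are needed to reach a small loss as the depth $k$ grows.

```python
import numpy as np

def convergence_time(k, lr=0.01, tol=1e-3, max_iters=10**6, seed=0):
    """Iterations of gradient descent on F(w) = f(prod_i w_i) with f(x) = (x - 1)^2."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(k)                  # stand-in for a "standard" random init
    for t in range(max_iters):
        p = np.prod(w)
        if (p - 1.0) ** 2 < tol:                # stop once the loss is small
            return t
        # dF/dw_i = f'(prod_j w_j) * prod_{j != i} w_j
        grad = 2.0 * (p - 1.0) * np.array([np.prod(np.delete(w, i)) for i in range(k)])
        w = w - lr * grad
    return max_iters                            # budget exhausted: treat as "not converged"

for k in range(1, 8):
    print(f"depth k={k}: {convergence_time(k)} iterations")
```

On typical seeds the iteration count grows rapidly with $k$, consistent with the exponential scaling described in the abstract; the exact numbers depend on the assumed choices of $f$, step size, and tolerance.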
