Saddle-To-Saddle Dynamics in Deep ReLU Networks: Low-Rank Bias in the First Saddle Escape

27 May 2025
Ioannis Bantzis
James B. Simon
Arthur Jacot
    ODL
arXiv (abs) | PDF | HTML
Main: 8 pages
8 figures
Bibliography: 4 pages
Appendix: 19 pages
Abstract

When a deep ReLU network is initialized with small weights, GD is at first dominated by the saddle at the origin in parameter space. We study the so-called escape directions, which play a similar role to the eigenvectors of the Hessian for strict saddles. We show that the optimal escape direction features a low-rank bias in its deeper layers: the first singular value of the $\ell$-th layer weight matrix is at least $\ell^{\frac{1}{4}}$ larger than any other singular value. We also prove a number of related results about these escape directions. We argue that this result is a first step in proving Saddle-to-Saddle dynamics in deep ReLU networks, where GD visits a sequence of saddles with increasing bottleneck rank.
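Below is a minimal numerical sketch, not taken from the paper, of how one might look for the claimed low-rank bias in practice: it trains a small deep ReLU network from a small random initialization with plain gradient descent and prints, for each layer's weight matrix, the ratio of the first to the second singular value as GD escapes the saddle at the origin. The architecture, the synthetic rank-1 teacher data, and all hyperparameters are illustrative assumptions and may need tuning for the escape to occur within the given step budget.

# Illustrative sketch only (not the paper's construction): track per-layer
# singular-value gaps while GD escapes the saddle at the origin.
# Architecture, data, and hyperparameters below are assumptions.
import torch
import torch.nn as nn

torch.manual_seed(0)

depth, width, d_in, d_out = 4, 32, 16, 8
layers = []
for l in range(depth):
    fan_in = d_in if l == 0 else width
    fan_out = d_out if l == depth - 1 else width
    lin = nn.Linear(fan_in, fan_out, bias=False)
    nn.init.normal_(lin.weight, std=1e-2)  # small weights: start near the origin saddle
    layers.append(lin)
    if l < depth - 1:
        layers.append(nn.ReLU())
net = nn.Sequential(*layers)

# Synthetic rank-1 teacher task (an arbitrary, illustrative choice).
X = torch.randn(256, d_in)
Y = X @ (torch.randn(d_in, 1) @ torch.randn(1, d_out))

opt = torch.optim.SGD(net.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

for step in range(20001):
    opt.zero_grad()
    loss = loss_fn(net(X), Y)
    loss.backward()
    opt.step()
    if step % 2000 == 0:
        # Ratio of the largest to the second-largest singular value per layer;
        # the result under study predicts a stronger gap in deeper layers.
        ratios = []
        for m in net:
            if isinstance(m, nn.Linear):
                s = torch.linalg.svdvals(m.weight.detach())
                ratios.append((s[0] / s[1]).item())
        print(f"step {step:6d}  loss {loss.item():.3e}  "
              "s1/s2 per layer: " + ", ".join(f"{r:.2f}" for r in ratios))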

View on arXiv
@article{bantzis2025_2505.21722,
  title={Saddle-To-Saddle Dynamics in Deep ReLU Networks: Low-Rank Bias in the First Saddle Escape},
  author={Ioannis Bantzis and James B. Simon and Arthur Jacot},
  journal={arXiv preprint arXiv:2505.21722},
  year={2025}
}