Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2106.15933
Cited By
Saddle-to-Saddle Dynamics in Deep Linear Networks: Small Initialization Training, Symmetry, and Sparsity
30 June 2021
Arthur Jacot
François Ged
Berfin cSimcsek
Clément Hongler
Franck Gabriel
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Saddle-to-Saddle Dynamics in Deep Linear Networks: Small Initialization Training, Symmetry, and Sparsity"
13 / 13 papers shown
Title
On the Cone Effect in the Learning Dynamics
Zhanpeng Zhou
Yongyi Yang
Jie Ren
Mahito Sugiyama
Junchi Yan
53
0
0
20 Mar 2025
The Optimization Landscape of SGD Across the Feature Learning Strength
Alexander B. Atanasov
Alexandru Meterez
James B. Simon
Cengiz Pehlevan
43
2
0
06 Oct 2024
From Lazy to Rich: Exact Learning Dynamics in Deep Linear Networks
Clémentine Dominé
Nicolas Anguita
A. Proca
Lukas Braun
D. Kunin
P. Mediano
Andrew M. Saxe
38
3
0
22 Sep 2024
Connectivity Shapes Implicit Regularization in Matrix Factorization Models for Matrix Completion
Zhiwei Bai
Jiajie Zhao
Yaoyu Zhang
AI4CE
37
0
0
22 May 2024
Early Directional Convergence in Deep Homogeneous Neural Networks for Small Initializations
Akshay Kumar
Jarvis Haupt
ODL
44
3
0
12 Mar 2024
Saddle-to-Saddle Dynamics in Diagonal Linear Networks
Scott Pesme
Nicolas Flammarion
31
35
0
02 Apr 2023
On the Stepwise Nature of Self-Supervised Learning
James B. Simon
Maksis Knutins
Liu Ziyin
Daniel Geisz
Abraham J. Fetterman
Joshua Albrecht
SSL
37
30
0
27 Mar 2023
Type-II Saddles and Probabilistic Stability of Stochastic Gradient Descent
Liu Ziyin
Botao Li
Tomer Galanti
Masakuni Ueda
37
7
0
23 Mar 2023
Generalization on the Unseen, Logic Reasoning and Degree Curriculum
Emmanuel Abbe
Samy Bengio
Aryo Lotfi
Kevin Rizk
LRM
39
49
0
30 Jan 2023
Infinite-width limit of deep linear neural networks
Lénaïc Chizat
Maria Colombo
Xavier Fernández-Real
Alessio Figalli
31
14
0
29 Nov 2022
SGD with Large Step Sizes Learns Sparse Features
Maksym Andriushchenko
Aditya Varre
Loucas Pillaud-Vivien
Nicolas Flammarion
45
56
0
11 Oct 2022
Implicit Bias of Large Depth Networks: a Notion of Rank for Nonlinear Functions
Arthur Jacot
36
25
0
29 Sep 2022
Gradient flow dynamics of shallow ReLU networks for square loss and orthogonal inputs
Etienne Boursier
Loucas Pillaud-Vivien
Nicolas Flammarion
ODL
24
58
0
02 Jun 2022
1