Saddle-to-Saddle Dynamics in Deep Linear Networks: Small Initialization Training, Symmetry, and Sparsity

30 June 2021

Berfin Şimşek

Clément Hongler

Papers citing "Saddle-to-Saddle Dynamics in Deep Linear Networks: Small Initialization Training, Symmetry, and Sparsity"

Title | Authors | Tag | Counts | Date
On the Cone Effect in the Learning Dynamics | Zhanpeng Zhou, Yongyi Yang, Jie Ren, Mahito Sugiyama, Junchi Yan | - | 53 0 0 | 20 Mar 2025
The Optimization Landscape of SGD Across the Feature Learning Strength | Alexander B. Atanasov, Alexandru Meterez, James B. Simon, Cengiz Pehlevan | - | 43 2 0 | 06 Oct 2024
From Lazy to Rich: Exact Learning Dynamics in Deep Linear Networks | Clémentine Dominé, Nicolas Anguita, A. Proca, Lukas Braun, D. Kunin, P. Mediano, Andrew M. Saxe | - | 38 3 0 | 22 Sep 2024
Connectivity Shapes Implicit Regularization in Matrix Factorization Models for Matrix Completion | Zhiwei Bai, Jiajie Zhao, Yaoyu Zhang | AI4CE | 37 0 0 | 22 May 2024
Early Directional Convergence in Deep Homogeneous Neural Networks for Small Initializations | Akshay Kumar, Jarvis Haupt | ODL | 44 3 0 | 12 Mar 2024
Saddle-to-Saddle Dynamics in Diagonal Linear Networks | Scott Pesme, Nicolas Flammarion | - | 31 35 0 | 02 Apr 2023
On the Stepwise Nature of Self-Supervised Learning | James B. Simon, Maksis Knutins, Liu Ziyin, Daniel Geisz, Abraham J. Fetterman, Joshua Albrecht | SSL | 37 30 0 | 27 Mar 2023
Type-II Saddles and Probabilistic Stability of Stochastic Gradient Descent | Liu Ziyin, Botao Li, Tomer Galanti, Masakuni Ueda | - | 37 7 0 | 23 Mar 2023
Generalization on the Unseen, Logic Reasoning and Degree Curriculum | Emmanuel Abbe, Samy Bengio, Aryo Lotfi, Kevin Rizk | LRM | 39 49 0 | 30 Jan 2023
Infinite-width limit of deep linear neural networks | Lénaïc Chizat, Maria Colombo, Xavier Fernández-Real, Alessio Figalli | - | 31 14 0 | 29 Nov 2022
SGD with Large Step Sizes Learns Sparse Features | Maksym Andriushchenko, Aditya Varre, Loucas Pillaud-Vivien, Nicolas Flammarion | - | 45 56 0 | 11 Oct 2022
Implicit Bias of Large Depth Networks: a Notion of Rank for Nonlinear Functions | Arthur Jacot | - | 36 25 0 | 29 Sep 2022
Gradient flow dynamics of shallow ReLU networks for square loss and orthogonal inputs | Etienne Boursier, Loucas Pillaud-Vivien, Nicolas Flammarion | ODL | 24 58 0 | 02 Jun 2022