Towards understanding epoch-wise double descent in two-layer linear neural networks
Amanda Olmin, Fredrik Lindsten
13 July 2024 · arXiv:2407.09845 · MLT

Papers citing "Towards understanding epoch-wise double descent in two-layer linear neural networks" (17 of 17 papers shown)
  1. Double Descent Meets Out-of-Distribution Detection: Theoretical Insights and Empirical Analysis on the role of model complexity — Mouin Ben Ammar, David Brellmann, Arturo Mendoza, Antoine Manzanera, Gianni Franchi · OODD · 04 Nov 2024
  2. Get rich quick: exact solutions reveal how unbalanced initializations promote rapid feature learning — D. Kunin, Allan Raventós, Clémentine Dominé, Feng Chen, David Klindt, Andrew M. Saxe, Surya Ganguli · MLT · 10 Jun 2024
  3. A U-turn on Double Descent: Rethinking Parameter Counting in Statistical Learning — Alicia Curth, Alan Jeffares, M. Schaar · 29 Oct 2023
  4. Saddle-to-Saddle Dynamics in Diagonal Linear Networks — Scott Pesme, Nicolas Flammarion · 02 Apr 2023
  5. Implicit Regularization for Group Sparsity — Jiangyuan Li, Thanh Van Nguyen, Chinmay Hegde, Raymond K. W. Wong · 29 Jan 2023
  6. Neural Networks as Kernel Learners: The Silent Alignment Effect — Alexander B. Atanasov, Blake Bordelon, Cengiz Pehlevan · MLT · 29 Oct 2021
  7. Model, sample, and epoch-wise descents: exact solution of gradient flow in the random feature model — A. Bodin, N. Macris · 22 Oct 2021
  8. On the Generalization of Models Trained with SGD: Information-Theoretic Bounds and Implications — Ziqiao Wang, Yongyi Mao · FedML, MLT · 07 Oct 2021
  9. Implicit Sparse Regularization: The Impact of Depth and Early Stopping — Jiangyuan Li, Thanh V. Nguyen, Chinmay Hegde, R. K. Wong · 12 Aug 2021
  10. Implicit Bias of SGD for Diagonal Linear Networks: a Provable Benefit of Stochasticity — Scott Pesme, Loucas Pillaud-Vivien, Nicolas Flammarion · 17 Jun 2021
  11. Early Stopping in Deep Networks: Double Descent and How to Eliminate it — Reinhard Heckel, Fatih Yilmaz · 20 Jul 2020
  12. The Implicit Bias of Depth: How Incremental Learning Drives Generalization — Daniel Gissin, Shai Shalev-Shwartz, Amit Daniely · AI4CE · 26 Sep 2019
  13. Surprises in High-Dimensional Ridgeless Least Squares Interpolation — Trevor Hastie, Andrea Montanari, Saharon Rosset, Robert Tibshirani · 19 Mar 2019
  14. Reconciling modern machine learning practice and the bias-variance trade-off — M. Belkin, Daniel J. Hsu, Siyuan Ma, Soumik Mandal · 28 Dec 2018
  15. A mathematical theory of semantic development in deep neural networks — Andrew M. Saxe, James L. McClelland, Surya Ganguli · 23 Oct 2018
  16. An analytic theory of generalization dynamics and transfer learning in deep linear networks — Andrew Kyle Lampinen, Surya Ganguli · OOD · 27 Sep 2018
  17. Exact solutions to the nonlinear dynamics of learning in deep linear neural networks — Andrew M. Saxe, James L. McClelland, Surya Ganguli · ODL · 20 Dec 2013