arXiv: 2305.16038
Cited By
Implicit bias of SGD in $L_{2}$-regularized linear DNNs: One-way jumps from high to low rank
25 May 2023
Zihan Wang
Arthur Jacot
Papers citing "Implicit bias of SGD in $L_{2}$-regularized linear DNNs: One-way jumps from high to low rank" (8 papers shown):
Position: Solve Layerwise Linear Models First to Understand Neural Dynamical Phenomena (Neural Collapse, Emergence, Lazy/Rich Regime, and Grokking)
Yoonsoo Nam, Seok Hyeong Lee, Clementine Domine, Yea Chan Park, Charles London, Wonyl Choi, Niclas Goring, Seungjai Lee
28 Feb 2025
Feature Learning in $L_{2}$-regularized DNNs: Attraction/Repulsion and Sparsity
Arthur Jacot, Eugene Golikov, Clément Hongler, Franck Gabriel
31 May 2022
Implicit Bias of SGD for Diagonal Linear Networks: a Provable Benefit of Stochasticity
Scott Pesme, Loucas Pillaud-Vivien, Nicolas Flammarion
17 Jun 2021
Implicit Regularization in Deep Matrix Factorization
Sanjeev Arora, Nadav Cohen, Wei Hu, Yuping Luo
31 May 2019
Three Factors Influencing Minima in SGD
Stanislaw Jastrzebski, Zachary Kenton, Devansh Arpit, Nicolas Ballas, Asja Fischer, Yoshua Bengio, Amos Storkey
13 Nov 2017
Stochastic Gradient Descent as Approximate Bayesian Inference
Stephan Mandt, Matthew D. Hoffman, David M. Blei
13 Apr 2017
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang
15 Sep 2016
Matrix Completion from a Few Entries
Raghunandan H. Keshavan, Andrea Montanari, Sewoong Oh
20 Jan 2009