Gradient descent aligns the layers of deep linear networks

4 October 2018

Papers citing "Gradient descent aligns the layers of deep linear networks"

50 / 61 papers shown

Embedding principle of homogeneous neural network for classification problem Jiahan Zhang Tao Luo Yaoyu Zhang 2 0 0 18 May 2025
Gradient Descent Robustly Learns the Intrinsic Dimension of Data in Training Convolutional Neural Networks Chenyang Zhang Peifeng Gao Difan Zou Yuan Cao OOD MLT 59 0 0 11 Apr 2025
The late-stage training dynamics of (stochastic) subgradient descent on homogeneous neural networks Sholom Schechtman Nicolas Schreuder 173 0 0 08 Feb 2025
Grokking at the Edge of Numerical Stability Lucas Prieto Melih Barsbey Pedro A.M. Mediano Tolga Birdal 48 3 0 08 Jan 2025
Compressible Dynamics in Deep Overparameterized Low-Rank Learning & Adaptation Can Yaras Peng Wang Laura Balzano Qing Qu AI4CE 37 12 0 06 Jun 2024
Connectivity Shapes Implicit Regularization in Matrix Factorization Models for Matrix Completion Zhiwei Bai Jiajie Zhao Yaoyu Zhang AI4CE 37 0 0 22 May 2024
Implicit Bias of AdamW: $\ell_\infty$ Norm Constrained Optimization Shuo Xie Zhiyuan Li OffRL 47 13 0 05 Apr 2024
The Implicit Bias of Batch Normalization in Linear Models and Two-layer Linear Convolutional Neural Networks Yuan Cao Difan Zou Yuan-Fang Li Quanquan Gu MLT 37 5 0 20 Jun 2023
ReLU Neural Networks with Linear Layers are Biased Towards Single- and Multi-Index Models Suzanna Parkinson Greg Ongie Rebecca Willett 65 6 0 24 May 2023
Implicit Bias of Gradient Descent for Logistic Regression at the Edge of Stability Jingfeng Wu Vladimir Braverman Jason D. Lee 32 17 0 19 May 2023
Phase Diagram of Initial Condensation for Two-layer Neural Networks Zheng Chen Yuqing Li Tao Luo Zhaoguang Zhou Z. Xu MLT AI4CE 49 8 0 12 Mar 2023
Provable Pathways: Learning Multiple Tasks over Multiple Paths Yingcong Li Samet Oymak MoE 26 4 0 08 Mar 2023
Infinite-width limit of deep linear neural networks Lénaïc Chizat Maria Colombo Xavier Fernández-Real Alessio Figalli 31 14 0 29 Nov 2022
Regression as Classification: Influence of Task Formulation on Neural Network Features Lawrence Stewart Francis R. Bach Quentin Berthet Jean-Philippe Vert 29 24 0 10 Nov 2022
Same Pre-training Loss, Better Downstream: Implicit Bias Matters for Language Models Hong Liu Sang Michael Xie Zhiyuan Li Tengyu Ma AI4CE 40 49 0 25 Oct 2022
Implicit Bias of Large Depth Networks: a Notion of Rank for Nonlinear Functions Arthur Jacot 36 25 0 29 Sep 2022
Magnitude and Angle Dynamics in Training Single ReLU Neurons Sangmin Lee Byeongsu Sim Jong Chul Ye MLT 96 6 0 27 Sep 2022
Deep Linear Networks can Benignly Overfit when Shallow Ones Do Niladri S. Chatterji Philip M. Long 23 8 0 19 Sep 2022
On the Implicit Bias in Deep-Learning Algorithms Gal Vardi FedML AI4CE 34 72 0 26 Aug 2022
The Neural Race Reduction: Dynamics of Abstraction in Gated Networks Andrew M. Saxe Shagun Sodhani Sam Lewallen AI4CE 30 34 0 21 Jul 2022
Reconstructing Training Data from Trained Neural Networks Niv Haim Gal Vardi Gilad Yehudai Ohad Shamir Michal Irani 40 132 0 15 Jun 2022
Neural Collapse: A Review on Modelling Principles and Generalization Vignesh Kothapalli 25 71 0 08 Jun 2022
Gradient flow dynamics of shallow ReLU networks for square loss and orthogonal inputs Etienne Boursier Loucas Pillaud-Vivien Nicolas Flammarion ODL 24 58 0 02 Jun 2022
Convergence of gradient descent for deep neural networks S. Chatterjee ODL 21 20 0 30 Mar 2022
Explicitising The Implicit Intrepretability of Deep Neural Networks Via Duality Chandrashekar Lakshminarayanan Ashutosh Kumar Singh A. Rajkumar AI4CE 26 1 0 01 Mar 2022
Implicit Regularization in Hierarchical Tensor Factorization and Deep Convolutional Neural Networks Noam Razin Asaf Maman Nadav Cohen 46 29 0 27 Jan 2022
Benign Overfitting in Adversarially Robust Linear Classification Jinghui Chen Yuan Cao Quanquan Gu AAML SILM 34 10 0 31 Dec 2021
Variational autoencoders in the presence of low-dimensional data: landscape and implicit bias Frederic Koehler Viraj Mehta Chenghui Zhou Andrej Risteski DRL 36 12 0 13 Dec 2021
Understanding Dimensional Collapse in Contrastive Self-supervised Learning Li Jing Pascal Vincent Yann LeCun Yuandong Tian SSL 25 338 0 18 Oct 2021
Parallel Deep Neural Networks Have Zero Duality Gap Yifei Wang Tolga Ergen Mert Pilanci 79 10 0 13 Oct 2021
Self-supervised Learning is More Robust to Dataset Imbalance Hong Liu Jeff Z. HaoChen Adrien Gaidon Tengyu Ma OOD SSL 33 157 0 11 Oct 2021
Large Learning Rate Tames Homogeneity: Convergence and Balancing Effect Yuqing Wang Minshuo Chen T. Zhao Molei Tao AI4CE 57 40 0 07 Oct 2021
On Margin Maximization in Linear and ReLU Networks Gal Vardi Ohad Shamir Nathan Srebro 50 28 0 06 Oct 2021
Interpolation can hurt robust generalization even when there is no noise Konstantin Donhauser Alexandru Țifrea Michael Aerni Reinhard Heckel Fanny Yang 34 14 0 05 Aug 2021
A Theoretical Analysis of Fine-tuning with Linear Teachers Gal Shachaf Alon Brutzkus Amir Globerson 34 17 0 04 Jul 2021
Understanding the role of importance weighting for deep learning Da Xu Yuting Ye Chuanwei Ruan FAtt 39 43 0 28 Mar 2021
Noisy Gradient Descent Converges to Flat Minima for Nonconvex Matrix Factorization Tianyi Liu Yan Li S. Wei Enlu Zhou T. Zhao 21 13 0 24 Feb 2021
On the Implicit Bias of Initialization Shape: Beyond Infinitesimal Mirror Descent Shahar Azulay E. Moroshko Mor Shpigel Nacson Blake E. Woodworth Nathan Srebro Amir Globerson Daniel Soudry AI4CE 33 73 0 19 Feb 2021
The Implicit Bias for Adaptive Optimization Algorithms on Homogeneous Neural Networks Bohan Wang Qi Meng Wei Chen Tie-Yan Liu 30 33 0 11 Dec 2020
Align, then memorise: the dynamics of learning with feedback alignment Maria Refinetti Stéphane d'Ascoli Ruben Ohana Sebastian Goldt 26 36 0 24 Nov 2020
Implicit Bias in Deep Linear Classification: Initialization Scale vs Training Accuracy E. Moroshko Suriya Gunasekar Blake E. Woodworth J. Lee Nathan Srebro Daniel Soudry 35 85 0 13 Jul 2020
When Does Preconditioning Help or Hurt Generalization? S. Amari Jimmy Ba Roger C. Grosse Xuechen Li Atsushi Nitanda Taiji Suzuki Denny Wu Ji Xu 36 32 0 18 Jun 2020
Shape Matters: Understanding the Implicit Bias of the Noise Covariance Jeff Z. HaoChen Colin Wei J. Lee Tengyu Ma 29 93 0 15 Jun 2020
To Each Optimizer a Norm, To Each Norm its Generalization Sharan Vaswani Reza Babanezhad Jose Gallego Aaron Mishkin Simon Lacoste-Julien Nicolas Le Roux 26 8 0 11 Jun 2020
Directional convergence and alignment in deep learning Ziwei Ji Matus Telgarsky 17 162 0 11 Jun 2020
Implicit Regularization in Deep Learning May Not Be Explainable by Norms Noam Razin Nadav Cohen 24 155 0 13 May 2020
An Optimization and Generalization Analysis for Max-Pooling Networks Alon Brutzkus Amir Globerson MLT AI4CE 16 4 0 22 Feb 2020
Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks Trained with the Logistic Loss Lénaïc Chizat Francis R. Bach MLT 39 327 0 11 Feb 2020
Optimization for deep learning: theory and algorithms Ruoyu Sun ODL 19 168 0 19 Dec 2019
Global Convergence of Gradient Descent for Deep Linear Residual Networks Lei Wu Qingcan Wang Chao Ma ODL AI4CE 28 22 0 02 Nov 2019