Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2304.00488
Cited By
Saddle-to-Saddle Dynamics in Diagonal Linear Networks
2 April 2023
Scott Pesme
Nicolas Flammarion
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Saddle-to-Saddle Dynamics in Diagonal Linear Networks"
25 / 25 papers shown
Title
Understanding the Learning Dynamics of LoRA: A Gradient Flow Perspective on Low-Rank Adaptation in Matrix Factorization
Ziqing Xu
Hancheng Min
Lachlan Ewen MacDonald
Jinqi Luo
Salma Tarmoun
Enrique Mallada
Rene Vidal
AI4CE
56
0
0
10 Mar 2025
Position: Solve Layerwise Linear Models First to Understand Neural Dynamical Phenomena (Neural Collapse, Emergence, Lazy/Rich Regime, and Grokking)
Yoonsoo Nam
Seok Hyeong Lee
Clementine Domine
Yea Chan Park
Charles London
Wonyl Choi
Niclas Goring
Seungjai Lee
AI4CE
38
0
0
28 Feb 2025
The Sharpness Disparity Principle in Transformers for Accelerating Language Model Pre-Training
Jinbo Wang
Mingze Wang
Zhanpeng Zhou
Junchi Yan
Weinan E
Lei Wu
89
1
0
26 Feb 2025
Towards understanding epoch-wise double descent in two-layer linear neural networks
Amanda Olmin
Fredrik Lindsten
MLT
29
3
0
13 Jul 2024
How JEPA Avoids Noisy Features: The Implicit Bias of Deep Linear Self Distillation Networks
Etai Littwin
Omid Saremi
Madhu Advani
Vimal Thilak
Preetum Nakkiran
Chen Huang
Joshua Susskind
46
3
0
03 Jul 2024
Implicit Bias of Mirror Flow on Separable Data
Scott Pesme
Radu-Alexandru Dragomir
Nicolas Flammarion
39
1
0
18 Jun 2024
Improving Generalization and Convergence by Enhancing Implicit Regularization
Mingze Wang
Haotian He
Jinbo Wang
Zilin Wang
Guanhua Huang
Feiyu Xiong
Zhiyu Li
E. Weinan
Lei Wu
45
7
0
31 May 2024
Synchronization on circles and spheres with nonlinear interactions
Christopher Criscitiello
Quentin Rebjock
Andrew D. McRae
Nicolas Boumal
31
4
0
28 May 2024
Mixed Dynamics In Linear Networks: Unifying the Lazy and Active Regimes
Zhenfeng Tu
Santiago Aranguri
Arthur Jacot
38
8
0
27 May 2024
Implicit Regularization of Gradient Flow on One-Layer Softmax Attention
Heejune Sheen
Siyu Chen
Tianhao Wang
Harrison H. Zhou
MLT
46
10
0
13 Mar 2024
Directional Convergence Near Small Initializations and Saddles in Two-Homogeneous Neural Networks
Akshay Kumar
Jarvis Haupt
ODL
30
7
0
14 Feb 2024
When Representations Align: Universality in Representation Learning Dynamics
Loek van Rossem
Andrew M. Saxe
AI4CE
48
4
0
14 Feb 2024
Stochastic Gradient Flow Dynamics of Test Risk and its Exact Solution for Weak Features
Rodrigo Veiga
Anastasia Remizova
Nicolas Macris
40
0
0
12 Feb 2024
Loss Landscape of Shallow ReLU-like Neural Networks: Stationary Points, Saddle Escape, and Network Embedding
Zhengqing Wu
Berfin Simsek
Francois Ged
ODL
48
0
0
08 Feb 2024
Compression of Structured Data with Autoencoders: Provable Benefit of Nonlinearities and Depth
Kevin Kögler
A. Shevchenko
Hamed Hassani
Marco Mondelli
MLT
33
0
0
07 Feb 2024
Understanding Unimodal Bias in Multimodal Deep Linear Networks
Yedi Zhang
Peter E. Latham
Andrew Saxe
34
6
0
01 Dec 2023
Gradient Descent with Polyak's Momentum Finds Flatter Minima via Large Catapults
Prin Phunyaphibarn
Junghyun Lee
Bohan Wang
Huishuai Zhang
Chulhee Yun
29
0
0
25 Nov 2023
Achieving Margin Maximization Exponentially Fast via Progressive Norm Rescaling
Mingze Wang
Zeping Min
Lei Wu
33
3
0
24 Nov 2023
SGD Finds then Tunes Features in Two-Layer Neural Networks with near-Optimal Sample Complexity: A Case Study in the XOR problem
Margalit Glasgow
MLT
82
13
0
26 Sep 2023
Implicit regularization in AI meets generalized hardness of approximation in optimization -- Sharp results for diagonal linear networks
J. S. Wind
Vegard Antun
A. Hansen
27
4
0
13 Jul 2023
Transformers learn through gradual rank increase
Enric Boix-Adserà
Etai Littwin
Emmanuel Abbe
Samy Bengio
J. Susskind
54
33
0
12 Jun 2023
Learning a Neuron by a Shallow ReLU Network: Dynamics and Implicit Bias for Correlated Inputs
D. Chistikov
Matthias Englert
R. Lazic
MLT
36
12
0
10 Jun 2023
Robust Implicit Regularization via Weight Normalization
H. Chou
Holger Rauhut
Rachel A. Ward
40
7
0
09 May 2023
SGD learning on neural networks: leap complexity and saddle-to-saddle dynamics
Emmanuel Abbe
Enric Boix-Adserà
Theodor Misiakiewicz
FedML
MLT
84
74
0
21 Feb 2023
(S)GD over Diagonal Linear Networks: Implicit Regularisation, Large Stepsizes and Edge of Stability
Mathieu Even
Scott Pesme
Suriya Gunasekar
Nicolas Flammarion
28
16
0
17 Feb 2023
1