Saddle-to-Saddle Dynamics in Diagonal Linear Networks

2 April 2023

Papers citing "Saddle-to-Saddle Dynamics in Diagonal Linear Networks"

Understanding the Learning Dynamics of LoRA: A Gradient Flow Perspective on Low-Rank Adaptation in Matrix Factorization Ziqing Xu Hancheng Min Lachlan Ewen MacDonald Jinqi Luo Salma Tarmoun Enrique Mallada Rene Vidal AI4CE 56 0 0 10 Mar 2025
Position: Solve Layerwise Linear Models First to Understand Neural Dynamical Phenomena (Neural Collapse, Emergence, Lazy/Rich Regime, and Grokking) Yoonsoo Nam Seok Hyeong Lee Clementine Domine Yea Chan Park Charles London Wonyl Choi Niclas Goring Seungjai Lee AI4CE 38 0 0 28 Feb 2025
The Sharpness Disparity Principle in Transformers for Accelerating Language Model Pre-Training Jinbo Wang Mingze Wang Zhanpeng Zhou Junchi Yan Weinan E Lei Wu 89 1 0 26 Feb 2025
Towards understanding epoch-wise double descent in two-layer linear neural networks Amanda Olmin Fredrik Lindsten MLT 29 3 0 13 Jul 2024
How JEPA Avoids Noisy Features: The Implicit Bias of Deep Linear Self Distillation Networks Etai Littwin Omid Saremi Madhu Advani Vimal Thilak Preetum Nakkiran Chen Huang Joshua Susskind 46 3 0 03 Jul 2024
Implicit Bias of Mirror Flow on Separable Data Scott Pesme Radu-Alexandru Dragomir Nicolas Flammarion 39 1 0 18 Jun 2024
Improving Generalization and Convergence by Enhancing Implicit Regularization Mingze Wang Haotian He Jinbo Wang Zilin Wang Guanhua Huang Feiyu Xiong Zhiyu Li Weinan E Lei Wu 45 7 0 31 May 2024
Synchronization on circles and spheres with nonlinear interactions Christopher Criscitiello Quentin Rebjock Andrew D. McRae Nicolas Boumal 31 4 0 28 May 2024
Mixed Dynamics In Linear Networks: Unifying the Lazy and Active Regimes Zhenfeng Tu Santiago Aranguri Arthur Jacot 38 8 0 27 May 2024
Implicit Regularization of Gradient Flow on One-Layer Softmax Attention Heejune Sheen Siyu Chen Tianhao Wang Harrison H. Zhou MLT 46 10 0 13 Mar 2024
Directional Convergence Near Small Initializations and Saddles in Two-Homogeneous Neural Networks Akshay Kumar Jarvis Haupt ODL 30 7 0 14 Feb 2024
When Representations Align: Universality in Representation Learning Dynamics Loek van Rossem Andrew M. Saxe AI4CE 48 4 0 14 Feb 2024
Stochastic Gradient Flow Dynamics of Test Risk and its Exact Solution for Weak Features Rodrigo Veiga Anastasia Remizova Nicolas Macris 40 0 0 12 Feb 2024
Loss Landscape of Shallow ReLU-like Neural Networks: Stationary Points, Saddle Escape, and Network Embedding Zhengqing Wu Berfin Simsek Francois Ged ODL 48 0 0 08 Feb 2024
Compression of Structured Data with Autoencoders: Provable Benefit of Nonlinearities and Depth Kevin Kögler A. Shevchenko Hamed Hassani Marco Mondelli MLT 33 0 0 07 Feb 2024
Understanding Unimodal Bias in Multimodal Deep Linear Networks Yedi Zhang Peter E. Latham Andrew Saxe 34 6 0 01 Dec 2023
Gradient Descent with Polyak's Momentum Finds Flatter Minima via Large Catapults Prin Phunyaphibarn Junghyun Lee Bohan Wang Huishuai Zhang Chulhee Yun 29 0 0 25 Nov 2023
Achieving Margin Maximization Exponentially Fast via Progressive Norm Rescaling Mingze Wang Zeping Min Lei Wu 33 3 0 24 Nov 2023
SGD Finds then Tunes Features in Two-Layer Neural Networks with near-Optimal Sample Complexity: A Case Study in the XOR problem Margalit Glasgow MLT 82 13 0 26 Sep 2023
Implicit regularization in AI meets generalized hardness of approximation in optimization -- Sharp results for diagonal linear networks J. S. Wind Vegard Antun A. Hansen 27 4 0 13 Jul 2023
Transformers learn through gradual rank increase Enric Boix-Adserà Etai Littwin Emmanuel Abbe Samy Bengio J. Susskind 54 33 0 12 Jun 2023
Learning a Neuron by a Shallow ReLU Network: Dynamics and Implicit Bias for Correlated Inputs D. Chistikov Matthias Englert R. Lazic MLT 36 12 0 10 Jun 2023
Robust Implicit Regularization via Weight Normalization H. Chou Holger Rauhut Rachel A. Ward 40 7 0 09 May 2023
SGD learning on neural networks: leap complexity and saddle-to-saddle dynamics Emmanuel Abbe Enric Boix-Adserà Theodor Misiakiewicz FedML MLT 84 74 0 21 Feb 2023
(S)GD over Diagonal Linear Networks: Implicit Regularisation, Large Stepsizes and Edge of Stability Mathieu Even Scott Pesme Suriya Gunasekar Nicolas Flammarion 28 16 0 17 Feb 2023