2310.18935
Cited By
Implicit Bias of Gradient Descent for Two-layer ReLU and Leaky ReLU Networks on Nearly-orthogonal Data
29 October 2023
Yiwen Kou
Zixiang Chen
Quanquan Gu
MLT
Papers citing
"Implicit Bias of Gradient Descent for Two-layer ReLU and Leaky ReLU Networks on Nearly-orthogonal Data"
12 / 12 papers shown
Title
Symmetry in Neural Network Parameter Spaces
Bo Zhao
Robin Walters
Rose Yu
29
0
0
16 Jun 2025
The Rich and the Simple: On the Implicit Bias of Adam and SGD
Bhavya Vasudeva
Jung Whan Lee
Vatsal Sharan
Mahdi Soltanolkotabi
36
0
0
29 May 2025
Directional Convergence, Benign Overfitting of Gradient Descent in leaky ReLU two-layer Neural Networks
Ichiro Hashimoto
MLT
81
0
0
22 May 2025
How Transformers Learn Regular Language Recognition: A Theoretical Study on Training Dynamics and Implicit Bias
Ruiquan Huang
Yingbin Liang
Jing Yang
131
0
0
02 May 2025
Gradient Descent Robustly Learns the Intrinsic Dimension of Data in Training Convolutional Neural Networks
Chenyang Zhang
Peifeng Gao
Difan Zou
Yuan Cao
OOD
MLT
161
0
0
11 Apr 2025
Non-asymptotic Convergence of Training Transformers for Next-token Prediction
Ruiquan Huang
Yingbin Liang
Jing Yang
92
7
0
25 Sep 2024
The Implicit Bias of Adam on Separable Data
Chenyang Zhang
Difan Zou
Yuan Cao
AI4CE
94
9
0
15 Jun 2024
Simplicity Bias of Two-Layer Networks beyond Linearly Separable Data
Nikita Tsoy
Nikola Konstantinov
77
4
0
27 May 2024
Transformers Learn Low Sensitivity Functions: Investigations and Implications
Bhavya Vasudeva
Deqing Fu
Tianyi Zhou
Elliott Kau
Youqi Huang
Vatsal Sharan
113
2
0
11 Mar 2024
Implicit Bias and Fast Convergence Rates for Self-attention
Bhavya Vasudeva
Puneesh Deora
Christos Thrampoulidis
117
21
0
08 Feb 2024
From Stability to Chaos: Analyzing Gradient Descent Dynamics in Quadratic Regression
Xuxing Chen
Krishnakumar Balasubramanian
Promit Ghosal
Bhavya Agrawalla
72
8
0
02 Oct 2023
Early Neuron Alignment in Two-layer ReLU Networks with Small Initialization
Hancheng Min
Enrique Mallada
René Vidal
MLT
92
23
0
24 Jul 2023