arXiv:1803.01905
Convergence of Gradient Descent on Separable Data
Mor Shpigel Nacson, Jason D. Lee, Suriya Gunasekar, Pedro H. P. Savarese, Nathan Srebro, Daniel Soudry
5 March 2018
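As context for the citation list below, here is a minimal sketch of the phenomenon the paper above analyzes: on linearly separable data, gradient descent on the logistic loss never converges in norm, but its direction approaches the max-margin (hard-margin SVM) separator. The dataset, step size, and iteration budget are illustrative assumptions, not taken from the paper.

import numpy as np

rng = np.random.default_rng(0)

# Two well-separated Gaussian blobs with labels in {-1, +1}.
X = np.vstack([rng.normal([2.0, 2.0], 0.3, (50, 2)),
               rng.normal([-2.0, -2.0], 0.3, (50, 2))])
y = np.hstack([np.ones(50), -np.ones(50)])

def grad(w):
    # Gradient of the empirical logistic loss (1/n) sum_i log(1 + exp(-y_i <x_i, w>)).
    margins = y * (X @ w)
    s = np.exp(-np.logaddexp(0.0, margins))  # sigmoid(-margin), numerically stable
    return -(X * (y * s)[:, None]).mean(axis=0)

w = np.zeros(2)
eta = 1.0  # fixed step size; the paper studies how such choices affect convergence rates
for t in range(1, 100_001):
    w -= eta * grad(w)
    if t in (100, 1_000, 10_000, 100_000):
        norm = np.linalg.norm(w)
        print(f"t={t:>6}  ||w||={norm:7.2f}  "
              f"min normalized margin={(y * (X @ w)).min() / norm:.4f}")

# ||w|| grows without bound (roughly like log t), while w / ||w|| stabilizes:
# the minimum normalized margin climbs toward the hard-margin SVM value.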
Papers citing "Convergence of Gradient Descent on Separable Data" (50 of 53 shown):
1. Embedding principle of homogeneous neural network for classification problem. Jiahan Zhang, Yaoyu Zhang, Tao Luo. 18 May 2025.
2. How Transformers Learn Regular Language Recognition: A Theoretical Study on Training Dynamics and Implicit Bias. Ruiquan Huang, Yingbin Liang, Jing Yang. 02 May 2025.
3. Minimax Optimal Convergence of Gradient Descent in Logistic Regression via Large and Adaptive Stepsizes. Ruiqi Zhang, Jingfeng Wu, Licong Lin, Peter L. Bartlett. 05 Apr 2025.
4. Implicit Geometry of Next-token Prediction: From Language Sparsity Patterns to Model Representations. Yize Zhao, Tina Behnia, V. Vakilian, Christos Thrampoulidis. 20 Feb 2025.
5. The late-stage training dynamics of (stochastic) subgradient descent on homogeneous neural networks. Sholom Schechtman, Nicolas Schreuder. 08 Feb 2025.
6. Implicit Bias of AdamW: ℓ∞-Norm Constrained Optimization. Shuo Xie, Zhiyuan Li. [OffRL] 05 Apr 2024.
7. Heavy-Tailed Class Imbalance and Why Adam Outperforms Gradient Descent on Language Models. Frederik Kunstner, Robin Yadav, Alan Milligan, Mark Schmidt, Alberto Bietti. 29 Feb 2024.
8. Implicit Bias and Fast Convergence Rates for Self-attention. Bhavya Vasudeva, Puneesh Deora, Christos Thrampoulidis. 08 Feb 2024.
9. Gradient Descent Converges Linearly for Logistic Regression on Separable Data. Kyriakos Axiotis, M. Sviridenko. [MLT] 26 Jun 2023.
10. The Implicit Bias of Batch Normalization in Linear Models and Two-layer Linear Convolutional Neural Networks. Yuan Cao, Difan Zou, Yuan-Fang Li, Quanquan Gu. [MLT] 20 Jun 2023.
11. Fast Convergence in Learning Two-Layer Neural Networks with Separable Data. Hossein Taheri, Christos Thrampoulidis. [MLT] 22 May 2023.
12. Implicit Bias of Gradient Descent for Logistic Regression at the Edge of Stability. Jingfeng Wu, Vladimir Braverman, Jason D. Lee. 19 May 2023.
13. General Loss Functions Lead to (Approximate) Interpolation in High Dimensions. Kuo-Wei Lai, Vidya Muthukumar. 13 Mar 2023.
14. Generalization and Stability of Interpolating Neural Networks with Minimal Width. Hossein Taheri, Christos Thrampoulidis. 18 Feb 2023.
15. Convergence beyond the over-parameterized regime using Rayleigh quotients. David A. R. Robin, Kevin Scaman, Marc Lelarge. 19 Jan 2023.
16. Iterative regularization in classification via hinge loss diagonal descent. Vassilis Apidopoulos, T. Poggio, Lorenzo Rosasco, S. Villa. 24 Dec 2022.
17. Mechanistic Mode Connectivity. Ekdeep Singh Lubana, Eric J. Bigelow, Robert P. Dick, David M. Krueger, Hidenori Tanaka. 15 Nov 2022.
18. Importance Tempering: Group Robustness for Overparameterized Models. Yiping Lu, Wenlong Ji, Zachary Izzo, Lexing Ying. 19 Sep 2022.
19. On Generalization of Decentralized Learning with Separable Data. Hossein Taheri, Christos Thrampoulidis. [FedML] 15 Sep 2022.
20. On the Implicit Bias in Deep-Learning Algorithms. Gal Vardi. [FedML, AI4CE] 26 Aug 2022.
21. Imbalance Trouble: Revisiting Neural-Collapse Geometry. Christos Thrampoulidis, Ganesh Ramachandra Kini, V. Vakilian, Tina Behnia. 10 Aug 2022.
22. Implicit Bias of Gradient Descent on Reparametrized Models: On Equivalence to Mirror Descent. Zhiyuan Li, Tianhao Wang, Jason D. Lee, Sanjeev Arora. 08 Jul 2022.
23. Stability vs Implicit Bias of Gradient Methods on Separable Data and Beyond. Matan Schliserman, Tomer Koren. 27 Feb 2022.
24. Implicit Regularization in Hierarchical Tensor Factorization and Deep Convolutional Neural Networks. Noam Razin, Asaf Maman, Nadav Cohen. 27 Jan 2022.
25. On generalization bounds for deep networks based on loss surface implicit regularization. Masaaki Imaizumi, Johannes Schmidt-Hieber. [ODL] 12 Jan 2022.
26. On Margin Maximization in Linear and ReLU Networks. Gal Vardi, Ohad Shamir, Nathan Srebro. 06 Oct 2021.
27. On Large-Cohort Training for Federated Learning. Zachary B. Charles, Zachary Garrett, Zhouyuan Huo, Sergei Shmulyian, Virginia Smith. [FedML] 15 Jun 2021.
28. A Geometric Analysis of Neural Collapse with Unconstrained Features. Zhihui Zhu, Tianyu Ding, Jinxin Zhou, Xiao Li, Chong You, Jeremias Sulam, Qing Qu. 06 May 2021.
29. The Low-Rank Simplicity Bias in Deep Networks. Minyoung Huh, H. Mobahi, Richard Y. Zhang, Brian Cheung, Pulkit Agrawal, Phillip Isola. 18 Mar 2021.
30. Label-Imbalanced and Group-Sensitive Classification under Overparameterization. Ganesh Ramachandra Kini, Orestis Paraskevas, Samet Oymak, Christos Thrampoulidis. 02 Mar 2021.
31. Dissecting Supervised Contrastive Learning. Florian Graf, Christoph Hofer, Marc Niethammer, Roland Kwitt. [SSL] 17 Feb 2021.
32. Connecting Interpretability and Robustness in Decision Trees through Separation. Michal Moshkovitz, Yao-Yuan Yang, Kamalika Chaudhuri. 14 Feb 2021.
33. The Implicit Bias for Adaptive Optimization Algorithms on Homogeneous Neural Networks. Bohan Wang, Qi Meng, Wei Chen, Tie-Yan Liu. 11 Dec 2020.
34. Direction Matters: On the Implicit Bias of Stochastic Gradient Descent with Moderate Learning Rate. Jingfeng Wu, Difan Zou, Vladimir Braverman, Quanquan Gu. 04 Nov 2020.
35. A Unifying View on Implicit Bias in Training Linear Neural Networks. Chulhee Yun, Shankar Krishnan, H. Mobahi. [MLT] 06 Oct 2020.
36. Implicit Bias in Deep Linear Classification: Initialization Scale vs Training Accuracy. E. Moroshko, Suriya Gunasekar, Blake E. Woodworth, Jason D. Lee, Nathan Srebro, Daniel Soudry. 13 Jul 2020.
37. An analytic theory of shallow networks dynamics for hinge loss classification. Franco Pellegrini, Giulio Biroli. 19 Jun 2020.
38. Shape Matters: Understanding the Implicit Bias of the Noise Covariance. Jeff Z. HaoChen, Colin Wei, Jason D. Lee, Tengyu Ma. 15 Jun 2020.
39. To Each Optimizer a Norm, To Each Norm its Generalization. Sharan Vaswani, Reza Babanezhad, Jose Gallego, Aaron Mishkin, Simon Lacoste-Julien, Nicolas Le Roux. 11 Jun 2020.
40. Classification vs regression in overparameterized regimes: Does the loss function matter? Vidya Muthukumar, Adhyyan Narang, Vignesh Subramanian, M. Belkin, Daniel J. Hsu, A. Sahai. 16 May 2020.
41. Implicit Regularization in Deep Learning May Not Be Explainable by Norms. Noam Razin, Nadav Cohen. 13 May 2020.
42. Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks Trained with the Logistic Loss. Lénaïc Chizat, Francis R. Bach. [MLT] 11 Feb 2020.
43. The Implicit Bias of Depth: How Incremental Learning Drives Generalization. Daniel Gissin, Shai Shalev-Shwartz, Amit Daniely. [AI4CE] 26 Sep 2019.
44. How Does Learning Rate Decay Help Modern Neural Networks? Kaichao You, Mingsheng Long, Jianmin Wang, Michael I. Jordan. 05 Aug 2019.
45. Gradient Descent Maximizes the Margin of Homogeneous Neural Networks. Kaifeng Lyu, Jian Li. 13 Jun 2019.
46. Implicit Regularization in Deep Matrix Factorization. Sanjeev Arora, Nadav Cohen, Wei Hu, Yuping Luo. [AI4CE] 31 May 2019.
47. Lexicographic and Depth-Sensitive Margins in Homogeneous and Non-Homogeneous Deep Models. Mor Shpigel Nacson, Suriya Gunasekar, Jason D. Lee, Nathan Srebro, Daniel Soudry. 17 May 2019.
48. Condition Number Analysis of Logistic Regression, and its Implications for Standard First-Order Solution Methods. R. Freund, Paul Grigas, Rahul Mazumder. 20 Oct 2018.
49. Regularization Matters: Generalization and Optimization of Neural Nets v.s. their Induced Kernel. Colin Wei, Jason D. Lee, Qiang Liu, Tengyu Ma. 12 Oct 2018.
50. On the Learning Dynamics of Deep Neural Networks. Rémi Tachet des Combes, Mohammad Pezeshki, Samira Shabanian, Aaron Courville, Yoshua Bengio. 18 Sep 2018.