ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1803.01905
  4. Cited By
Convergence of Gradient Descent on Separable Data

Convergence of Gradient Descent on Separable Data

5 March 2018
Mor Shpigel Nacson
Jason D. Lee
Suriya Gunasekar
Pedro H. P. Savarese
Nathan Srebro
Daniel Soudry
ArXivPDFHTML

Papers citing "Convergence of Gradient Descent on Separable Data"

50 / 53 papers shown
Title
Embedding principle of homogeneous neural network for classification problem
Embedding principle of homogeneous neural network for classification problem
Jiahan Zhang
Yaoyu Zhang
Tao Luo
17
0
0
18 May 2025
How Transformers Learn Regular Language Recognition: A Theoretical Study on Training Dynamics and Implicit Bias
How Transformers Learn Regular Language Recognition: A Theoretical Study on Training Dynamics and Implicit Bias
Ruiquan Huang
Yingbin Liang
Jing Yang
60
0
0
02 May 2025
Minimax Optimal Convergence of Gradient Descent in Logistic Regression via Large and Adaptive Stepsizes
Minimax Optimal Convergence of Gradient Descent in Logistic Regression via Large and Adaptive Stepsizes
Ruiqi Zhang
Jingfeng Wu
Licong Lin
Peter L. Bartlett
33
0
0
05 Apr 2025
Implicit Geometry of Next-token Prediction: From Language Sparsity Patterns to Model Representations
Implicit Geometry of Next-token Prediction: From Language Sparsity Patterns to Model Representations
Yize Zhao
Tina Behnia
V. Vakilian
Christos Thrampoulidis
70
9
0
20 Feb 2025
The late-stage training dynamics of (stochastic) subgradient descent on homogeneous neural networks
Sholom Schechtman
Nicolas Schreuder
236
0
0
08 Feb 2025
Implicit Bias of AdamW: $\ell_\infty$ Norm Constrained Optimization
Implicit Bias of AdamW: ℓ∞\ell_\inftyℓ∞​ Norm Constrained Optimization
Shuo Xie
Zhiyuan Li
OffRL
50
13
0
05 Apr 2024
Heavy-Tailed Class Imbalance and Why Adam Outperforms Gradient Descent
  on Language Models
Heavy-Tailed Class Imbalance and Why Adam Outperforms Gradient Descent on Language Models
Frederik Kunstner
Robin Yadav
Alan Milligan
Mark Schmidt
Alberto Bietti
46
26
0
29 Feb 2024
Implicit Bias and Fast Convergence Rates for Self-attention
Implicit Bias and Fast Convergence Rates for Self-attention
Bhavya Vasudeva
Puneesh Deora
Christos Thrampoulidis
39
15
0
08 Feb 2024
Gradient Descent Converges Linearly for Logistic Regression on Separable
  Data
Gradient Descent Converges Linearly for Logistic Regression on Separable Data
Kyriakos Axiotis
M. Sviridenko
MLT
10
3
0
26 Jun 2023
The Implicit Bias of Batch Normalization in Linear Models and Two-layer
  Linear Convolutional Neural Networks
The Implicit Bias of Batch Normalization in Linear Models and Two-layer Linear Convolutional Neural Networks
Yuan Cao
Difan Zou
Yuan-Fang Li
Quanquan Gu
MLT
37
5
0
20 Jun 2023
Fast Convergence in Learning Two-Layer Neural Networks with Separable
  Data
Fast Convergence in Learning Two-Layer Neural Networks with Separable Data
Hossein Taheri
Christos Thrampoulidis
MLT
16
3
0
22 May 2023
Implicit Bias of Gradient Descent for Logistic Regression at the Edge of
  Stability
Implicit Bias of Gradient Descent for Logistic Regression at the Edge of Stability
Jingfeng Wu
Vladimir Braverman
Jason D. Lee
32
17
0
19 May 2023
General Loss Functions Lead to (Approximate) Interpolation in High
  Dimensions
General Loss Functions Lead to (Approximate) Interpolation in High Dimensions
Kuo-Wei Lai
Vidya Muthukumar
34
5
0
13 Mar 2023
Generalization and Stability of Interpolating Neural Networks with
  Minimal Width
Generalization and Stability of Interpolating Neural Networks with Minimal Width
Hossein Taheri
Christos Thrampoulidis
40
16
0
18 Feb 2023
Convergence beyond the over-parameterized regime using Rayleigh
  quotients
Convergence beyond the over-parameterized regime using Rayleigh quotients
David A. R. Robin
Kevin Scaman
Marc Lelarge
27
3
0
19 Jan 2023
Iterative regularization in classification via hinge loss diagonal
  descent
Iterative regularization in classification via hinge loss diagonal descent
Vassilis Apidopoulos
T. Poggio
Lorenzo Rosasco
S. Villa
29
2
0
24 Dec 2022
Mechanistic Mode Connectivity
Mechanistic Mode Connectivity
Ekdeep Singh Lubana
Eric J. Bigelow
Robert P. Dick
David M. Krueger
Hidenori Tanaka
34
45
0
15 Nov 2022
Importance Tempering: Group Robustness for Overparameterized Models
Importance Tempering: Group Robustness for Overparameterized Models
Yiping Lu
Wenlong Ji
Zachary Izzo
Lexing Ying
50
7
0
19 Sep 2022
On Generalization of Decentralized Learning with Separable Data
On Generalization of Decentralized Learning with Separable Data
Hossein Taheri
Christos Thrampoulidis
FedML
42
11
0
15 Sep 2022
On the Implicit Bias in Deep-Learning Algorithms
On the Implicit Bias in Deep-Learning Algorithms
Gal Vardi
FedML
AI4CE
36
73
0
26 Aug 2022
Imbalance Trouble: Revisiting Neural-Collapse Geometry
Imbalance Trouble: Revisiting Neural-Collapse Geometry
Christos Thrampoulidis
Ganesh Ramachandra Kini
V. Vakilian
Tina Behnia
35
70
0
10 Aug 2022
Implicit Bias of Gradient Descent on Reparametrized Models: On
  Equivalence to Mirror Descent
Implicit Bias of Gradient Descent on Reparametrized Models: On Equivalence to Mirror Descent
Zhiyuan Li
Tianhao Wang
Jason D. Lee
Sanjeev Arora
45
28
0
08 Jul 2022
Stability vs Implicit Bias of Gradient Methods on Separable Data and
  Beyond
Stability vs Implicit Bias of Gradient Methods on Separable Data and Beyond
Matan Schliserman
Tomer Koren
24
23
0
27 Feb 2022
Implicit Regularization in Hierarchical Tensor Factorization and Deep
  Convolutional Neural Networks
Implicit Regularization in Hierarchical Tensor Factorization and Deep Convolutional Neural Networks
Noam Razin
Asaf Maman
Nadav Cohen
49
29
0
27 Jan 2022
On generalization bounds for deep networks based on loss surface
  implicit regularization
On generalization bounds for deep networks based on loss surface implicit regularization
Masaaki Imaizumi
Johannes Schmidt-Hieber
ODL
34
3
0
12 Jan 2022
On Margin Maximization in Linear and ReLU Networks
On Margin Maximization in Linear and ReLU Networks
Gal Vardi
Ohad Shamir
Nathan Srebro
52
28
0
06 Oct 2021
On Large-Cohort Training for Federated Learning
On Large-Cohort Training for Federated Learning
Zachary B. Charles
Zachary Garrett
Zhouyuan Huo
Sergei Shmulyian
Virginia Smith
FedML
21
113
0
15 Jun 2021
A Geometric Analysis of Neural Collapse with Unconstrained Features
A Geometric Analysis of Neural Collapse with Unconstrained Features
Zhihui Zhu
Tianyu Ding
Jinxin Zhou
Xiao Li
Chong You
Jeremias Sulam
Qing Qu
40
197
0
06 May 2021
The Low-Rank Simplicity Bias in Deep Networks
The Low-Rank Simplicity Bias in Deep Networks
Minyoung Huh
H. Mobahi
Richard Y. Zhang
Brian Cheung
Pulkit Agrawal
Phillip Isola
35
110
0
18 Mar 2021
Label-Imbalanced and Group-Sensitive Classification under
  Overparameterization
Label-Imbalanced and Group-Sensitive Classification under Overparameterization
Ganesh Ramachandra Kini
Orestis Paraskevas
Samet Oymak
Christos Thrampoulidis
33
93
0
02 Mar 2021
Dissecting Supervised Contrastive Learning
Dissecting Supervised Contrastive Learning
Florian Graf
Christoph Hofer
Marc Niethammer
Roland Kwitt
SSL
117
70
0
17 Feb 2021
Connecting Interpretability and Robustness in Decision Trees through
  Separation
Connecting Interpretability and Robustness in Decision Trees through Separation
Michal Moshkovitz
Yao-Yuan Yang
Kamalika Chaudhuri
33
22
0
14 Feb 2021
The Implicit Bias for Adaptive Optimization Algorithms on Homogeneous
  Neural Networks
The Implicit Bias for Adaptive Optimization Algorithms on Homogeneous Neural Networks
Bohan Wang
Qi Meng
Wei Chen
Tie-Yan Liu
30
33
0
11 Dec 2020
Direction Matters: On the Implicit Bias of Stochastic Gradient Descent
  with Moderate Learning Rate
Direction Matters: On the Implicit Bias of Stochastic Gradient Descent with Moderate Learning Rate
Jingfeng Wu
Difan Zou
Vladimir Braverman
Quanquan Gu
18
18
0
04 Nov 2020
A Unifying View on Implicit Bias in Training Linear Neural Networks
A Unifying View on Implicit Bias in Training Linear Neural Networks
Chulhee Yun
Shankar Krishnan
H. Mobahi
MLT
23
80
0
06 Oct 2020
Implicit Bias in Deep Linear Classification: Initialization Scale vs
  Training Accuracy
Implicit Bias in Deep Linear Classification: Initialization Scale vs Training Accuracy
E. Moroshko
Suriya Gunasekar
Blake E. Woodworth
Jason D. Lee
Nathan Srebro
Daniel Soudry
35
85
0
13 Jul 2020
An analytic theory of shallow networks dynamics for hinge loss
  classification
An analytic theory of shallow networks dynamics for hinge loss classification
Franco Pellegrini
Giulio Biroli
35
19
0
19 Jun 2020
Shape Matters: Understanding the Implicit Bias of the Noise Covariance
Shape Matters: Understanding the Implicit Bias of the Noise Covariance
Jeff Z. HaoChen
Colin Wei
Jason D. Lee
Tengyu Ma
32
94
0
15 Jun 2020
To Each Optimizer a Norm, To Each Norm its Generalization
To Each Optimizer a Norm, To Each Norm its Generalization
Sharan Vaswani
Reza Babanezhad
Jose Gallego
Aaron Mishkin
Simon Lacoste-Julien
Nicolas Le Roux
26
8
0
11 Jun 2020
Classification vs regression in overparameterized regimes: Does the loss
  function matter?
Classification vs regression in overparameterized regimes: Does the loss function matter?
Vidya Muthukumar
Adhyyan Narang
Vignesh Subramanian
M. Belkin
Daniel J. Hsu
A. Sahai
43
149
0
16 May 2020
Implicit Regularization in Deep Learning May Not Be Explainable by Norms
Implicit Regularization in Deep Learning May Not Be Explainable by Norms
Noam Razin
Nadav Cohen
24
155
0
13 May 2020
Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks
  Trained with the Logistic Loss
Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks Trained with the Logistic Loss
Lénaïc Chizat
Francis R. Bach
MLT
39
329
0
11 Feb 2020
The Implicit Bias of Depth: How Incremental Learning Drives
  Generalization
The Implicit Bias of Depth: How Incremental Learning Drives Generalization
Daniel Gissin
Shai Shalev-Shwartz
Amit Daniely
AI4CE
14
78
0
26 Sep 2019
How Does Learning Rate Decay Help Modern Neural Networks?
How Does Learning Rate Decay Help Modern Neural Networks?
Kaichao You
Mingsheng Long
Jianmin Wang
Michael I. Jordan
30
4
0
05 Aug 2019
Gradient Descent Maximizes the Margin of Homogeneous Neural Networks
Gradient Descent Maximizes the Margin of Homogeneous Neural Networks
Kaifeng Lyu
Jian Li
52
325
0
13 Jun 2019
Implicit Regularization in Deep Matrix Factorization
Implicit Regularization in Deep Matrix Factorization
Sanjeev Arora
Nadav Cohen
Wei Hu
Yuping Luo
AI4CE
38
493
0
31 May 2019
Lexicographic and Depth-Sensitive Margins in Homogeneous and
  Non-Homogeneous Deep Models
Lexicographic and Depth-Sensitive Margins in Homogeneous and Non-Homogeneous Deep Models
Mor Shpigel Nacson
Suriya Gunasekar
Jason D. Lee
Nathan Srebro
Daniel Soudry
33
92
0
17 May 2019
Condition Number Analysis of Logistic Regression, and its Implications
  for Standard First-Order Solution Methods
Condition Number Analysis of Logistic Regression, and its Implications for Standard First-Order Solution Methods
R. Freund
Paul Grigas
Rahul Mazumder
20
10
0
20 Oct 2018
Regularization Matters: Generalization and Optimization of Neural Nets
  v.s. their Induced Kernel
Regularization Matters: Generalization and Optimization of Neural Nets v.s. their Induced Kernel
Colin Wei
Jason D. Lee
Qiang Liu
Tengyu Ma
28
245
0
12 Oct 2018
On the Learning Dynamics of Deep Neural Networks
On the Learning Dynamics of Deep Neural Networks
Rémi Tachet des Combes
Mohammad Pezeshki
Samira Shabanian
Aaron Courville
Yoshua Bengio
16
38
0
18 Sep 2018
12
Next