Gradient Descent Maximizes the Margin of Homogeneous Neural Networks

13 June 2019
Kaifeng Lyu, Jian Li
arXiv: 1906.05890

Papers citing "Gradient Descent Maximizes the Margin of Homogeneous Neural Networks"

Showing 46 of 246 citing papers.
Provable Generalization of SGD-trained Neural Networks of Any Width in the Presence of Adversarial Label Noise
Spencer Frei, Yuan Cao, Quanquan Gu · 04 Jan 2021 · FedML, MLT

Explicit regularization and implicit bias in deep network classifiers trained with the square loss
T. Poggio, Q. Liao · 31 Dec 2020

Towards Resolving the Implicit Bias of Gradient Descent for Matrix Factorization: Greedy Low-Rank Learning
Zhiyuan Li, Yuping Luo, Kaifeng Lyu · 17 Dec 2020

The Implicit Bias for Adaptive Optimization Algorithms on Homogeneous Neural Networks
Bohan Wang, Qi Meng, Wei Chen, Tie-Yan Liu · 11 Dec 2020

Implicit Regularization in ReLU Networks with the Square Loss
Gal Vardi, Ohad Shamir · 09 Dec 2020

Implicit bias of deep linear networks in the large learning rate phase
Wei Huang, Weitao Du, R. Xu, Chunrui Liu · 25 Nov 2020

Implicit bias of any algorithm: bounding bias via margin
Elvis Dohmatob · 12 Nov 2020

Inductive Bias of Gradient Descent for Weight Normalized Smooth Homogeneous Neural Nets
Depen Morwani, H. G. Ramaswamy · 24 Oct 2020

Train simultaneously, generalize better: Stability of gradient-based minimax learners
Farzan Farnia, Asuman Ozdaglar · 23 Oct 2020

Precise Statistical Analysis of Classification Accuracies for Adversarial Training
Adel Javanmard, Mahdi Soltanolkotabi · 21 Oct 2020 · AAML

Effects of Parameter Norm Growth During Transformer Training: Inductive Bias from Gradient Descent
William Merrill, Vivek Ramanujan, Yoav Goldberg, Roy Schwartz, Noah A. Smith · 19 Oct 2020 · AI4CE

AdaBelief Optimizer: Adapting Stepsizes by the Belief in Observed Gradients
Juntang Zhuang, Tommy M. Tang, Yifan Ding, S. Tatikonda, Nicha Dvornek, X. Papademetris, James S. Duncan · 15 Oct 2020 · ODL

A Unifying View on Implicit Bias in Training Linear Neural Networks
Chulhee Yun, Shankar Krishnan, H. Mobahi · 06 Oct 2020 · MLT

A Modular Analysis of Provable Acceleration via Polyak's Momentum: Training a Wide ReLU Network and a Deep Linear Network
Jun-Kun Wang, Chi-Heng Lin, Jacob D. Abernethy · 04 Oct 2020

Understanding Implicit Regularization in Over-Parameterized Single Index Model
Jianqing Fan, Zhuoran Yang, Mengxin Yu · 16 Jul 2020

Implicit Bias in Deep Linear Classification: Initialization Scale vs Training Accuracy
E. Moroshko, Suriya Gunasekar, Blake E. Woodworth, J. Lee, Nathan Srebro, Daniel Soudry · 13 Jul 2020

Regularization Matters: A Nonparametric Perspective on Overparametrized Neural Network
Tianyang Hu, Wenjia Wang, Cong Lin, Guang Cheng · 06 Jul 2020

The Surprising Simplicity of the Early-Time Learning Dynamics of Neural Networks
Wei Hu, Lechao Xiao, Ben Adlam, Jeffrey Pennington · 25 Jun 2020

Implicitly Maximizing Margins with the Hinge Loss
Justin Lizama · 25 Jun 2020

Gradient descent follows the regularization path for general losses
Ziwei Ji, Miroslav Dudík, Robert Schapire, Matus Telgarsky · 19 Jun 2020 · AI4CE, FaML

When Does Preconditioning Help or Hurt Generalization?
S. Amari, Jimmy Ba, Roger C. Grosse, Xuechen Li, Atsushi Nitanda, Taiji Suzuki, Denny Wu, Ji Xu · 18 Jun 2020

Directional Pruning of Deep Neural Networks
Shih-Kang Chao, Zhanyu Wang, Yue Xing, Guang Cheng · 16 Jun 2020 · ODL

Shape Matters: Understanding the Implicit Bias of the Noise Covariance
Jeff Z. HaoChen, Colin Wei, J. Lee, Tengyu Ma · 15 Jun 2020

Generalization by Recognizing Confusion
Daniel Chiu, Franklyn Wang, S. Kominers · 13 Jun 2020 · NoLa

Directional convergence and alignment in deep learning
Ziwei Ji, Matus Telgarsky · 11 Jun 2020

Structure preserving deep learning
E. Celledoni, Matthias Joachim Ehrhardt, Christian Etmann, R. McLachlan, B. Owren, Carola-Bibiane Schönlieb, Ferdia Sherry · 05 Jun 2020 · AI4CE

Is deeper better? It depends on locality of relevant features
Takashi Mori, Masahito Ueda · 26 May 2020 · OOD

Implicit Regularization in Deep Learning May Not Be Explainable by Norms
Noam Razin, Nadav Cohen · 13 May 2020

A function space analysis of finite neural networks with insights from sampling theory
Raja Giryes · 15 Apr 2020

Mirrorless Mirror Descent: A Natural Derivation of Mirror Descent
Suriya Gunasekar, Blake E. Woodworth, Nathan Srebro · 02 Apr 2020 · MDE

An Optimization and Generalization Analysis for Max-Pooling Networks
Alon Brutzkus, Amir Globerson · 22 Feb 2020 · MLT, AI4CE

On the Decision Boundaries of Neural Networks: A Tropical Geometry Perspective
Motasem Alfarra, Adel Bibi, Hasan Hammoud, M. Gaafar, Guohao Li · 20 Feb 2020

Unique Properties of Flat Minima in Deep Networks
Rotem Mulayoff, T. Michaeli · 11 Feb 2020 · ODL

Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks Trained with the Logistic Loss
Lénaïc Chizat, Francis R. Bach · 11 Feb 2020 · MLT

A Generalized Neural Tangent Kernel Analysis for Two-layer Neural Networks
Zixiang Chen, Yuan Cao, Quanquan Gu, Tong Zhang · 10 Feb 2020 · MLT

Reward Tweaking: Maximizing the Total Reward While Planning for Short Horizons
Chen Tessler, Shie Mannor · 09 Feb 2020

Sharp Rate of Convergence for Deep Neural Network Classifiers under the Teacher-Student Setting
Tianyang Hu, Zuofeng Shang, Guang Cheng · 19 Jan 2020

Double descent in the condition number
T. Poggio, Gil Kur, Andy Banburski · 12 Dec 2019

How Much Over-parameterization Is Sufficient to Learn Deep ReLU Networks?
Zixiang Chen, Yuan Cao, Difan Zou, Quanquan Gu · 27 Nov 2019

Improved Sample Complexities for Deep Networks and Robust Classification via an All-Layer Margin
Colin Wei, Tengyu Ma · 09 Oct 2019 · AAML, OOD

A Function Space View of Bounded Norm Infinite Width ReLU Nets: The Multivariate Case
Greg Ongie, Rebecca Willett, Daniel Soudry, Nathan Srebro · 03 Oct 2019

Theoretical Issues in Deep Networks: Approximation, Optimization and Generalization
T. Poggio, Andrzej Banburski, Q. Liao · 25 Aug 2019 · ODL

Interpolated Adversarial Training: Achieving Robust Neural Networks without Sacrificing Too Much Accuracy
Alex Lamb, Vikas Verma, Kenji Kawaguchi, Alexander Matyasko, Savya Khosla, Arno Solin, Yoshua Bengio · 16 Jun 2019 · AAML

Kernel and Rich Regimes in Overparametrized Models
Blake E. Woodworth, Suriya Gunasekar, Pedro H. P. Savarese, E. Moroshko, Itay Golan, J. Lee, Daniel Soudry, Nathan Srebro · 13 Jun 2019

Theory III: Dynamics and Generalization in Deep Networks
Andrzej Banburski, Q. Liao, Brando Miranda, Lorenzo Rosasco, Fernanda De La Torre, Jack Hidary, T. Poggio · 12 Mar 2019 · AI4CE

Approximation by Combinations of ReLU and Squared ReLU Ridge Functions with $\ell^1$ and $\ell^0$ Controls
Jason M. Klusowski, Andrew R. Barron · 26 Jul 2016