To Each Optimizer a Norm, To Each Norm its Generalization

11 June 2020
Sharan Vaswani
Reza Babanezhad
Jose Gallego
Aaron Mishkin
Simon Lacoste-Julien
Nicolas Le Roux

Papers citing "To Each Optimizer a Norm, To Each Norm its Generalization"

26 / 26 papers shown
Implicit Regularization in Deep Learning May Not Be Explainable by Norms
Noam Razin
Nadav Cohen
34
155
0
13 May 2020
Finite-sample Analysis of Interpolating Linear Classifiers in the Overparameterized Regime
Niladri S. Chatterji
Philip M. Long
18
109
0
25 Apr 2020
BackPACK: Packing more into backprop
Felix Dangel
Frederik Kunstner
Philipp Hennig
ODL
28
103
0
23 Dec 2019
Harnessing the Power of Infinitely Wide Deep Nets on Small-data Tasks
Sanjeev Arora
S. Du
Zhiyuan Li
Ruslan Salakhutdinov
Ruosong Wang
Dingli Yu
AAML
29
162
0
03 Oct 2019
Bias of Homotopic Gradient Descent for the Hinge Loss
Denali Molitor
Deanna Needell
Rachel A. Ward
22
5
0
26 Jul 2019
Benign Overfitting in Linear Regression
Peter L. Bartlett
Philip M. Long
Gábor Lugosi
Alexander Tsigler
MLT
34
769
0
26 Jun 2019
The Implicit Bias of AdaGrad on Separable Data
Qian Qian
Xiaoyuan Qian
37
23
0
09 Jun 2019
Implicit Regularization in Deep Matrix Factorization
Sanjeev Arora
Nadav Cohen
Wei Hu
Yuping Luo
AI4CE
52
500
0
31 May 2019
The Effect of Network Width on Stochastic Gradient Descent and Generalization: an Empirical Study
Daniel S. Park
Jascha Narain Sohl-Dickstein
Quoc V. Le
Samuel L. Smith
48
57
0
09 May 2019
Harmless interpolation of noisy data in regression
Vidya Muthukumar
Kailas Vodrahalli
Vignesh Subramanian
A. Sahai
38
204
0
21 Mar 2019
Surprises in High-Dimensional Ridgeless Least Squares Interpolation
Trevor Hastie
Andrea Montanari
Saharon Rosset
Robert Tibshirani
73
737
0
19 Mar 2019
Reconciling modern machine learning practice and the bias-variance trade-off
M. Belkin
Daniel J. Hsu
Siyuan Ma
Soumik Mandal
137
1,628
0
28 Dec 2018
Gradient descent aligns the layers of deep linear networks
Ziwei Ji
Matus Telgarsky
81
250
0
04 Oct 2018
Neural Tangent Kernel: Convergence and Generalization in Neural Networks
Arthur Jacot
Franck Gabriel
Clément Hongler
117
3,160
0
20 Jun 2018
Stochastic Gradient Descent on Separable Data: Exact Convergence with a Fixed Learning Rate
Mor Shpigel Nacson
Nathan Srebro
Daniel Soudry
FedML
MLT
41
100
0
05 Jun 2018
Implicit Bias of Gradient Descent on Linear Convolutional Networks
Suriya Gunasekar
Jason D. Lee
Daniel Soudry
Nathan Srebro
MDE
34
408
0
01 Jun 2018
Convergence of Gradient Descent on Separable Data
Mor Shpigel Nacson
Jason D. Lee
Suriya Gunasekar
Pedro H. P. Savarese
Nathan Srebro
Daniel Soudry
42
167
0
05 Mar 2018
Characterizing Implicit Bias in Terms of Optimization Geometry
Suriya Gunasekar
Jason D. Lee
Daniel Soudry
Nathan Srebro
AI4CE
55
404
0
22 Feb 2018
To understand deep learning we need to understand kernel learning
M. Belkin
Siyuan Ma
Soumik Mandal
25
414
0
05 Feb 2018
Improving Generalization Performance by Switching from Adam to SGD
N. Keskar
R. Socher
ODL
57
522
0
20 Dec 2017
The Implicit Bias of Gradient Descent on Separable Data
Daniel Soudry
Elad Hoffer
Mor Shpigel Nacson
Suriya Gunasekar
Nathan Srebro
51
908
0
27 Oct 2017
Implicit Regularization in Matrix Factorization
Suriya Gunasekar
Blake E. Woodworth
Srinadh Bhojanapalli
Behnam Neyshabur
Nathan Srebro
50
490
0
25 May 2017
The Marginal Value of Adaptive Gradient Methods in Machine Learning
Ashia Wilson
Rebecca Roelofs
Mitchell Stern
Nathan Srebro
Benjamin Recht
ODL
39
1,023
0
23 May 2017
Understanding deep learning requires rethinking generalization
Chiyuan Zhang
Samy Bengio
Moritz Hardt
Benjamin Recht
Oriol Vinyals
HAI
203
4,612
0
10 Nov 2016
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
262
149,474
0
22 Dec 2014
Sublinear Optimization for Machine Learning
K. Clarkson
Elad Hazan
David P. Woodruff
45
138
0
21 Oct 2010