Train faster, generalize better: Stability of stochastic gradient descent

3 September 2015
Moritz Hardt
Benjamin Recht
Y. Singer

Papers citing "Train faster, generalize better: Stability of stochastic gradient descent"

50 / 679 papers shown
Learning Compact Neural Networks with Regularization
Samet Oymak
MLT
101
39
0
05 Feb 2018
Generalization Error Bounds for Noisy, Iterative Algorithms
Ankit Pensia
Varun Jog
Po-Ling Loh
98
114
0
12 Jan 2018
Theory of Deep Learning III: explaining the non-overfitting puzzle
T. Poggio
Kenji Kawaguchi
Q. Liao
Alycia Lee
Lorenzo Rosasco
Xavier Boix
Jack Hidary
H. Mhaskar
ODL
104
128
0
30 Dec 2017
Algorithmic Regularization in Over-parameterized Matrix Sensing and Neural Networks with Quadratic Activations
Yuanzhi Li
Tengyu Ma
Hongyang R. Zhang
74
31
0
26 Dec 2017
DropMax: Adaptive Variational Softmax
Haebeom Lee
Juho Lee
Saehoon Kim
Eunho Yang
Sung Ju Hwang
56
13
0
21 Dec 2017
Improving Generalization Performance by Switching from Adam to SGD
N. Keskar
R. Socher
ODL
107
524
0
20 Dec 2017
Statistical Inference for the Population Landscape via Moment Adjusted Stochastic Gradients
Tengyuan Liang
Weijie Su
72
21
0
20 Dec 2017
Size-Independent Sample Complexity of Neural Networks
Noah Golowich
Alexander Rakhlin
Ohad Shamir
185
551
0
18 Dec 2017
Mathematics of Deep Learning
René Vidal
Joan Bruna
Raja Giryes
Stefano Soatto
OOD
70
120
0
13 Dec 2017
Online Learning via the Differential Privacy Lens
Jacob D. Abernethy
Young Hun Jung
Chansoo Lee
Audra McMillan
Ambuj Tewari
42
13
0
27 Nov 2017
Regularization for Deep Learning: A Taxonomy
J. Kukačka
Vladimir Golkov
Zorah Lähner
101
336
0
29 Oct 2017
The Implicit Bias of Gradient Descent on Separable Data
Daniel Soudry
Elad Hoffer
Mor Shpigel Nacson
Suriya Gunasekar
Nathan Srebro
263
926
0
27 Oct 2017
Stability and Generalization of Learning Algorithms that Converge to Global Optima
Zachary B. Charles
Dimitris Papailiopoulos
MLT
75
163
0
23 Oct 2017
Function Norms and Regularization in Deep Networks
Amal Rannen Triki
Maxim Berman
Matthew B. Blaschko
78
2
0
18 Oct 2017
Generalization in Deep Learning
Kenji Kawaguchi
L. Kaelbling
Yoshua Bengio
ODL
216
460
0
16 Oct 2017
A PAC-Bayesian Analysis of Randomized Learning with Application to Stochastic Gradient Descent
Ben London
100
79
0
19 Sep 2017
The Impact of Local Geometry and Batch Size on Stochastic Gradient Descent for Nonconvex Problems
V. Patel
MLT
73
8
0
14 Sep 2017
Stochastic Gradient Descent: Going As Fast As Possible But Not Faster
Alice Schoenauer Sebag
Marc Schoenauer
Michèle Sebag
45
11
0
05 Sep 2017
Convergence of Unregularized Online Learning Algorithms
Yunwen Lei
Lei Shi
Zheng-Chu Guo
89
14
0
09 Aug 2017
Regularizing and Optimizing LSTM Language Models
Stephen Merity
N. Keskar
R. Socher
183
1,099
0
07 Aug 2017
A Robust Multi-Batch L-BFGS Method for Machine Learning
A. Berahas
Martin Takáč
AAML, ODL
113
44
0
26 Jul 2017
Generalization Bounds of SGLD for Non-convex Learning: Two Theoretical Viewpoints
Wenlong Mou
Liwei Wang
Xiyu Zhai
Kai Zheng
MLT
75
159
0
19 Jul 2017
Towards Understanding Generalization of Deep Learning: Perspective of Loss Landscapes
Lei Wu
Zhanxing Zhu
E. Weinan
ODL
71
220
0
30 Jun 2017
Exploring Generalization in Deep Learning
Behnam Neyshabur
Srinadh Bhojanapalli
David A. McAllester
Nathan Srebro
FAtt
205
1,261
0
27 Jun 2017
Gradient Diversity: a Key Ingredient for Scalable Distributed Learning
Dong Yin
A. Pananjady
Max Lam
Dimitris Papailiopoulos
Kannan Ramchandran
Peter L. Bartlett
99
11
0
18 Jun 2017
A Closer Look at Memorization in Deep Networks
Devansh Arpit
Stanislaw Jastrzebski
Nicolas Ballas
David M. Krueger
Emmanuel Bengio
...
Tegan Maharaj
Asja Fischer
Aaron Courville
Yoshua Bengio
Simon Lacoste-Julien
TDI
206
1,834
0
16 Jun 2017
Stochastic Training of Neural Networks via Successive Convex Approximations
Simone Scardapane
Paolo Di Lorenzo
43
9
0
15 Jun 2017
Recovery Guarantees for One-hidden-layer Neural Networks
Kai Zhong
Zhao Song
Prateek Jain
Peter L. Bartlett
Inderjit S. Dhillon
MLT
209
337
0
10 Jun 2017
Are Saddles Good Enough for Deep Learning?
Adepu Ravi Sankar
V. Balasubramanian
65
5
0
07 Jun 2017
Deep Learning: Generalization Requires Deep Compositional Feature Space Design
Mrinal Haloi
MLT, OOD
34
3
0
06 Jun 2017
Classification regions of deep neural networks
Alhussein Fawzi
Seyed-Mohsen Moosavi-Dezfooli
P. Frossard
Stefano Soatto
86
51
0
26 May 2017
Train longer, generalize better: closing the generalization gap in large batch training of neural networks
Elad Hoffer
Itay Hubara
Daniel Soudry
ODL
207
803
0
24 May 2017
The Marginal Value of Adaptive Gradient Methods in Machine Learning
Ashia Wilson
Rebecca Roelofs
Mitchell Stern
Nathan Srebro
Benjamin Recht
ODL
125
1,035
0
23 May 2017
Bandit Structured Prediction for Neural Sequence-to-Sequence Learning
Julia Kreutzer
Artem Sokolov
Stefan Riezler
85
49
0
21 Apr 2017
Computing Nonvacuous Generalization Bounds for Deep (Stochastic) Neural Networks with Many More Parameters than Training Data
Gintare Karolina Dziugaite
Daniel M. Roy
128
820
0
31 Mar 2017
Efficient Private ERM for Smooth Objectives
Jiaqi Zhang
Kai Zheng
Wenlong Mou
Liwei Wang
62
145
0
29 Mar 2017
Sharp Minima Can Generalize For Deep Nets
Laurent Dinh
Razvan Pascanu
Samy Bengio
Yoshua Bengio
ODL
147
774
0
15 Mar 2017
Data-Dependent Stability of Stochastic Gradient Descent
Ilja Kuzborskij
Christoph H. Lampert
MLT
155
166
0
05 Mar 2017
Algorithmic stability and hypothesis complexity
Tongliang Liu
Gábor Lugosi
Gergely Neu
Dacheng Tao
101
92
0
28 Feb 2017
On architectural choices in deep learning: From network structure to gradient convergence and parameter estimation
V. Ithapu
Sathya Ravi
Vikas Singh
AI4CE
85
9
0
28 Feb 2017
Non-convex learning via Stochastic Gradient Langevin Dynamics: a nonasymptotic analysis
Maxim Raginsky
Alexander Rakhlin
Matus Telgarsky
88
521
0
13 Feb 2017
Fast Rates for Empirical Risk Minimization of Strict Saddle Problems
Alon Gonen
Shai Shalev-Shwartz
124
30
0
16 Jan 2017
Eigenvalues of the Hessian in Deep Learning: Singularity and Beyond
Levent Sagun
Léon Bottou
Yann LeCun
UQCV
108
235
0
22 Nov 2016
Understanding deep learning requires rethinking generalization
Chiyuan Zhang
Samy Bengio
Moritz Hardt
Benjamin Recht
Oriol Vinyals
HAI
383
4,641
0
10 Nov 2016
Entropy-SGD: Biasing Gradient Descent Into Wide Valleys
Pratik Chaudhari
A. Choromańska
Stefano Soatto
Yann LeCun
Carlo Baldassi
C. Borgs
J. Chayes
Levent Sagun
R. Zecchina
ODL
129
775
0
06 Nov 2016
Deep Information Propagation
S. Schoenholz
Justin Gilmer
Surya Ganguli
Jascha Narain Sohl-Dickstein
128
371
0
04 Nov 2016
Globally Optimal Training of Generalized Polynomial Neural Networks with Nonlinear Spectral Methods
A. Gautier
Quynh N. Nguyen
Matthias Hein
142
32
0
28 Oct 2016
Learning Scalable Deep Kernels with Recurrent Structure
Maruan Al-Shedivat
A. Wilson
Yunus Saatchi
Zhiting Hu
Eric Xing
BDL
106
106
0
27 Oct 2016
Membership Inference Attacks against Machine Learning Models
Reza Shokri
M. Stronati
Congzheng Song
Vitaly Shmatikov
SLR, MIALM, MIACV
333
4,177
0
18 Oct 2016
Generalization Error Bounds for Optimization Algorithms via Stability
Qi Meng
Yue Wang
Wei-neng Chen
Taifeng Wang
Zhiming Ma
Tie-Yan Liu
38
8
0
27 Sep 2016