1509.01240
Train faster, generalize better: Stability of stochastic gradient descent
3 September 2015
Moritz Hardt
Benjamin Recht
Y. Singer
Papers citing
"Train faster, generalize better: Stability of stochastic gradient descent"
50 / 679 papers shown
Title
Learning Compact Neural Networks with Regularization
Samet Oymak
MLT
101
39
0
05 Feb 2018
Generalization Error Bounds for Noisy, Iterative Algorithms
Ankit Pensia
Varun Jog
Po-Ling Loh
98
114
0
12 Jan 2018
Theory of Deep Learning III: explaining the non-overfitting puzzle
T. Poggio
Kenji Kawaguchi
Q. Liao
Alycia Lee
Lorenzo Rosasco
Xavier Boix
Jack Hidary
H. Mhaskar
ODL
104
128
0
30 Dec 2017
Algorithmic Regularization in Over-parameterized Matrix Sensing and Neural Networks with Quadratic Activations
Yuanzhi Li
Tengyu Ma
Hongyang R. Zhang
74
31
0
26 Dec 2017
DropMax: Adaptive Variational Softmax
Haebeom Lee
Juho Lee
Saehoon Kim
Eunho Yang
Sung Ju Hwang
56
13
0
21 Dec 2017
Improving Generalization Performance by Switching from Adam to SGD
N. Keskar
R. Socher
ODL
107
524
0
20 Dec 2017
Statistical Inference for the Population Landscape via Moment Adjusted Stochastic Gradients
Tengyuan Liang
Weijie Su
72
21
0
20 Dec 2017
Size-Independent Sample Complexity of Neural Networks
Noah Golowich
Alexander Rakhlin
Ohad Shamir
185
551
0
18 Dec 2017
Mathematics of Deep Learning
René Vidal
Joan Bruna
Raja Giryes
Stefano Soatto
OOD
70
120
0
13 Dec 2017
Online Learning via the Differential Privacy Lens
Jacob D. Abernethy
Young Hun Jung
Chansoo Lee
Audra McMillan
Ambuj Tewari
42
13
0
27 Nov 2017
Regularization for Deep Learning: A Taxonomy
J. Kukačka
Vladimir Golkov
Zorah Lähner
101
336
0
29 Oct 2017
The Implicit Bias of Gradient Descent on Separable Data
Daniel Soudry
Elad Hoffer
Mor Shpigel Nacson
Suriya Gunasekar
Nathan Srebro
263
926
0
27 Oct 2017
Stability and Generalization of Learning Algorithms that Converge to Global Optima
Zachary B. Charles
Dimitris Papailiopoulos
MLT
75
163
0
23 Oct 2017
Function Norms and Regularization in Deep Networks
Amal Rannen Triki
Maxim Berman
Matthew B. Blaschko
78
2
0
18 Oct 2017
Generalization in Deep Learning
Kenji Kawaguchi
L. Kaelbling
Yoshua Bengio
ODL
216
460
0
16 Oct 2017
A PAC-Bayesian Analysis of Randomized Learning with Application to Stochastic Gradient Descent
Ben London
100
79
0
19 Sep 2017
The Impact of Local Geometry and Batch Size on Stochastic Gradient Descent for Nonconvex Problems
V. Patel
MLT
73
8
0
14 Sep 2017
Stochastic Gradient Descent: Going As Fast As Possible But Not Faster
Alice Schoenauer Sebag
Marc Schoenauer
Michèle Sebag
45
11
0
05 Sep 2017
Convergence of Unregularized Online Learning Algorithms
Yunwen Lei
Lei Shi
Zheng-Chu Guo
89
14
0
09 Aug 2017
Regularizing and Optimizing LSTM Language Models
Stephen Merity
N. Keskar
R. Socher
183
1,099
0
07 Aug 2017
A Robust Multi-Batch L-BFGS Method for Machine Learning
A. Berahas
Martin Takáč
AAML
ODL
113
44
0
26 Jul 2017
Generalization Bounds of SGLD for Non-convex Learning: Two Theoretical Viewpoints
Wenlong Mou
Liwei Wang
Xiyu Zhai
Kai Zheng
MLT
75
159
0
19 Jul 2017
Towards Understanding Generalization of Deep Learning: Perspective of Loss Landscapes
Lei Wu
Zhanxing Zhu
E. Weinan
ODL
71
220
0
30 Jun 2017
Exploring Generalization in Deep Learning
Behnam Neyshabur
Srinadh Bhojanapalli
David A. McAllester
Nathan Srebro
FAtt
205
1,261
0
27 Jun 2017
Gradient Diversity: a Key Ingredient for Scalable Distributed Learning
Dong Yin
A. Pananjady
Max Lam
Dimitris Papailiopoulos
Kannan Ramchandran
Peter L. Bartlett
99
11
0
18 Jun 2017
A Closer Look at Memorization in Deep Networks
Devansh Arpit
Stanislaw Jastrzebski
Nicolas Ballas
David M. Krueger
Emmanuel Bengio
...
Tegan Maharaj
Asja Fischer
Aaron Courville
Yoshua Bengio
Simon Lacoste-Julien
TDI
206
1,834
0
16 Jun 2017
Stochastic Training of Neural Networks via Successive Convex Approximations
Simone Scardapane
Paolo Di Lorenzo
43
9
0
15 Jun 2017
Recovery Guarantees for One-hidden-layer Neural Networks
Kai Zhong
Zhao Song
Prateek Jain
Peter L. Bartlett
Inderjit S. Dhillon
MLT
209
337
0
10 Jun 2017
Are Saddles Good Enough for Deep Learning?
Adepu Ravi Sankar
V. Balasubramanian
65
5
0
07 Jun 2017
Deep Learning: Generalization Requires Deep Compositional Feature Space Design
Mrinal Haloi
MLT
OOD
34
3
0
06 Jun 2017
Classification regions of deep neural networks
Alhussein Fawzi
Seyed-Mohsen Moosavi-Dezfooli
P. Frossard
Stefano Soatto
86
51
0
26 May 2017
Train longer, generalize better: closing the generalization gap in large batch training of neural networks
Elad Hoffer
Itay Hubara
Daniel Soudry
ODL
207
803
0
24 May 2017
The Marginal Value of Adaptive Gradient Methods in Machine Learning
Ashia Wilson
Rebecca Roelofs
Mitchell Stern
Nathan Srebro
Benjamin Recht
ODL
125
1,035
0
23 May 2017
Bandit Structured Prediction for Neural Sequence-to-Sequence Learning
Julia Kreutzer
Artem Sokolov
Stefan Riezler
85
49
0
21 Apr 2017
Computing Nonvacuous Generalization Bounds for Deep (Stochastic) Neural Networks with Many More Parameters than Training Data
Gintare Karolina Dziugaite
Daniel M. Roy
128
820
0
31 Mar 2017
Efficient Private ERM for Smooth Objectives
Jiaqi Zhang
Kai Zheng
Wenlong Mou
Liwei Wang
62
145
0
29 Mar 2017
Sharp Minima Can Generalize For Deep Nets
Laurent Dinh
Razvan Pascanu
Samy Bengio
Yoshua Bengio
ODL
147
774
0
15 Mar 2017
Data-Dependent Stability of Stochastic Gradient Descent
Ilja Kuzborskij
Christoph H. Lampert
MLT
155
166
0
05 Mar 2017
Algorithmic stability and hypothesis complexity
Tongliang Liu
Gábor Lugosi
Gergely Neu
Dacheng Tao
101
92
0
28 Feb 2017
On architectural choices in deep learning: From network structure to gradient convergence and parameter estimation
V. Ithapu
Sathya Ravi
Vikas Singh
AI4CE
85
9
0
28 Feb 2017
Non-convex learning via Stochastic Gradient Langevin Dynamics: a nonasymptotic analysis
Maxim Raginsky
Alexander Rakhlin
Matus Telgarsky
88
521
0
13 Feb 2017
Fast Rates for Empirical Risk Minimization of Strict Saddle Problems
Alon Gonen
Shai Shalev-Shwartz
124
30
0
16 Jan 2017
Eigenvalues of the Hessian in Deep Learning: Singularity and Beyond
Levent Sagun
Léon Bottou
Yann LeCun
UQCV
108
235
0
22 Nov 2016
Understanding deep learning requires rethinking generalization
Chiyuan Zhang
Samy Bengio
Moritz Hardt
Benjamin Recht
Oriol Vinyals
HAI
383
4,641
0
10 Nov 2016
Entropy-SGD: Biasing Gradient Descent Into Wide Valleys
Pratik Chaudhari
A. Choromańska
Stefano Soatto
Yann LeCun
Carlo Baldassi
C. Borgs
J. Chayes
Levent Sagun
R. Zecchina
ODL
129
775
0
06 Nov 2016
Deep Information Propagation
S. Schoenholz
Justin Gilmer
Surya Ganguli
Jascha Narain Sohl-Dickstein
128
371
0
04 Nov 2016
Globally Optimal Training of Generalized Polynomial Neural Networks with Nonlinear Spectral Methods
A. Gautier
Quynh N. Nguyen
Matthias Hein
142
32
0
28 Oct 2016
Learning Scalable Deep Kernels with Recurrent Structure
Maruan Al-Shedivat
A. Wilson
Yunus Saatchi
Zhiting Hu
Eric Xing
BDL
106
106
0
27 Oct 2016
Membership Inference Attacks against Machine Learning Models
Reza Shokri
M. Stronati
Congzheng Song
Vitaly Shmatikov
SLR
MIALM
MIACV
333
4,177
0
18 Oct 2016
Generalization Error Bounds for Optimization Algorithms via Stability
Qi Meng
Yue Wang
Wei-neng Chen
Taifeng Wang
Zhiming Ma
Tie-Yan Liu
38
8
0
27 Sep 2016