1509.01240
Train faster, generalize better: Stability of stochastic gradient descent
3 September 2015
Moritz Hardt
Benjamin Recht
Y. Singer
Papers citing "Train faster, generalize better: Stability of stochastic gradient descent"
50 / 679 papers shown
Title
Stagewise Training Accelerates Convergence of Testing Error Over SGD
Zhuoning Yuan
Yan Yan
Rong Jin
Tianbao Yang
111
11
0
10 Dec 2018
Deep Frank-Wolfe For Neural Network Optimization
Leonard Berrada
Andrew Zisserman
M. P. Kumar
ODL
67
40
0
19 Nov 2018
Gradient Descent Finds Global Minima of Deep Neural Networks
S. Du
Jason D. Lee
Haochuan Li
Liwei Wang
Masayoshi Tomizuka
ODL
367
1,137
0
09 Nov 2018
Collaborative Filtering with Stability
Dongsheng Li
Chao Chen
Q. Lv
Junchi Yan
Li Shang
Stephen M. Chu
36
0
0
06 Nov 2018
Regularization Matters: Generalization and Optimization of Neural Nets v.s. their Induced Kernel
Colin Wei
Jason D. Lee
Qiang Liu
Tengyu Ma
287
245
0
12 Oct 2018
Graph-Dependent Implicit Regularisation for Distributed Stochastic Subgradient Descent
Dominic Richards
Patrick Rebeschini
73
18
0
18 Sep 2018
Global Convergence of Stochastic Gradient Hamiltonian Monte Carlo for Non-Convex Stochastic Optimization: Non-Asymptotic Performance Bounds and Momentum-Based Acceleration
Xuefeng Gao
Mert Gurbuzbalaban
Lingjiong Zhu
105
60
0
12 Sep 2018
On the Generalization of Stochastic Gradient Descent with Momentum
Ali Ramezani-Kebrya
Kimon Antonakopoulos
Volkan Cevher
Ashish Khisti
Ben Liang
MLT
73
26
0
12 Sep 2018
Decentralized Differentially Private Without-Replacement Stochastic Gradient Descent
Richeng Jin
Xiaofan He
H. Dai
FedML
71
2
0
08 Sep 2018
A Unified Analysis of Stochastic Momentum Methods for Deep Learning
Yan Yan
Tianbao Yang
Zhe Li
Qihang Lin
Yi Yang
59
120
0
30 Aug 2018
Lipschitz regularized Deep Neural Networks generalize and are adversarially robust
Chris Finlay
Jeff Calder
Bilal Abbasi
Adam M. Oberman
106
55
0
28 Aug 2018
Universal Stagewise Learning for Non-Convex Problems with Convergence on Averaged Solutions
Zaiyi Chen
Zhuoning Yuan
Jinfeng Yi
Bowen Zhou
Enhong Chen
Tianbao Yang
76
58
0
20 Aug 2018
Understanding training and generalization in deep learning by Fourier analysis
Zhi-Qin John Xu
AI4CE
93
94
0
13 Aug 2018
Generalization Error in Deep Learning
Daniel Jakubovitz
Raja Giryes
M. Rodrigues
AI4CE
244
111
0
03 Aug 2018
Machine Learning with Membership Privacy using Adversarial Regularization
Milad Nasr
Reza Shokri
Amir Houmansadr
FedML
MIACV
87
477
0
16 Jul 2018
Training behavior of deep neural network in frequency domain
Zhi-Qin John Xu
Yaoyu Zhang
Yan Xiao
AI4CE
99
320
0
03 Jul 2018
Theory IIIb: Generalization in Deep Networks
T. Poggio
Q. Liao
Alycia Lee
Andrzej Banburski
Xavier Boix
Jack Hidary
ODL
AI4CE
104
26
0
29 Jun 2018
Laplacian Smoothing Gradient Descent
Stanley Osher
Bao Wang
Penghang Yin
Xiyang Luo
Farzin Barekat
Minh Pham
A. Lin
ODL
113
43
0
17 Jun 2018
Stochastic Gradient Descent with Exponential Convergence Rates of Expected Classification Errors
Atsushi Nitanda
Taiji Suzuki
77
10
0
14 Jun 2018
Improving Regression Performance with Distributional Losses
Ehsan Imani
Martha White
UQCV
74
67
0
12 Jun 2018
PAC-Bayes Control: Learning Policies that Provably Generalize to Novel Environments
Anirudha Majumdar
M. Goldstein
Anoopkumar Sonar
104
18
0
11 Jun 2018
Training Faster by Separating Modes of Variation in Batch-normalized Models
Mahdi M. Kalayeh
M. Shah
70
43
0
07 Jun 2018
Implicit regularization and solution uniqueness in over-parameterized matrix sensing
Kelly Geyer
Anastasios Kyrillidis
A. Kalev
126
4
0
06 Jun 2018
Representational Power of ReLU Networks and Polynomial Kernels: Beyond Worst-Case Analysis
Frederic Koehler
Andrej Risteski
47
12
0
29 May 2018
Stable Recurrent Models
John Miller
Moritz Hardt
99
119
0
25 May 2018
How Much Restricted Isometry is Needed In Nonconvex Matrix Recovery?
Richard Y. Zhang
C. Josz
Somayeh Sojoudi
Javad Lavaei
70
42
0
25 May 2018
Statistical Optimality of Stochastic Gradient Descent on Hard Learning Problems through Multiple Passes
Loucas Pillaud-Vivien
Alessandro Rudi
Francis R. Bach
187
103
0
25 May 2018
Deep learning generalizes because the parameter-function map is biased towards simple functions
Guillermo Valle Pérez
Chico Q. Camargo
A. Louis
MLT
AI4CE
124
232
0
22 May 2018
Measuring and regularizing networks in function space
Ari S. Benjamin
David Rolnick
Konrad Paul Kording
87
140
0
21 May 2018
Stochastic modified equations for the asynchronous stochastic gradient descent
Jing An
Jian-wei Lu
Lexing Ying
77
79
0
21 May 2018
Constrained-CNN losses for weakly supervised segmentation
H. Kervadec
Jose Dolz
Meng Tang
Eric Granger
Yuri Boykov
Ismail Ben Ayed
92
240
0
12 May 2018
SaaS: Speed as a Supervisor for Semi-supervised Learning
Safa Cicek
Alhussein Fawzi
Stefano Soatto
BDL
85
19
0
02 May 2018
Stability of the Stochastic Gradient Method for an Approximated Large Scale Kernel Machine
A. Samareh
Mahshid Salemi Parizi
13
0
0
21 Apr 2018
A Study on Overfitting in Deep Reinforcement Learning
Chiyuan Zhang
Oriol Vinyals
Rémi Munos
Samy Bengio
OffRL
OnRL
63
391
0
18 Apr 2018
Regularisation of Neural Networks by Enforcing Lipschitz Continuity
Henry Gouk
E. Frank
Bernhard Pfahringer
M. Cree
208
485
0
12 Apr 2018
Stability and Convergence Trade-off of Iterative Optimization Algorithms
Yuansi Chen
Chi Jin
Bin Yu
71
93
0
04 Apr 2018
Privacy-preserving Prediction
Cynthia Dwork
Vitaly Feldman
82
91
0
27 Mar 2018
Gradient Descent Quantizes ReLU Network Features
Hartmut Maennel
Olivier Bousquet
Sylvain Gelly
MLT
74
82
0
22 Mar 2018
Robust Blind Deconvolution via Mirror Descent
Sathya Ravi
Ronak R. Mehta
Vikas Singh
26
3
0
21 Mar 2018
Constrained Deep Learning using Conditional Gradient and Applications in Computer Vision
Sathya Ravi
Tuan Dinh
Vishnu Suresh Lokhande
Vikas Singh
AI4CE
71
22
0
17 Mar 2018
Model-Agnostic Private Learning via Stability
Raef Bassily
Om Thakkar
Abhradeep Thakurta
FedML
41
12
0
14 Mar 2018
On the Power of Over-parametrization in Neural Networks with Quadratic Activation
S. Du
Jason D. Lee
188
272
0
03 Mar 2018
A Walk with SGD
Chen Xing
Devansh Arpit
Christos Tsirigotis
Yoshua Bengio
102
119
0
24 Feb 2018
On the Connection Between Learning Two-Layers Neural Networks and Tensor Decomposition
Marco Mondelli
Andrea Montanari
MLT
CML
96
59
0
20 Feb 2018
Generalization Error Bounds with Probabilistic Guarantee for SGD in Nonconvex Optimization
Yi Zhou
Yingbin Liang
Huishuai Zhang
MLT
87
26
0
19 Feb 2018
An analysis of training and generalization errors in shallow and deep networks
H. Mhaskar
T. Poggio
UQCV
59
18
0
17 Feb 2018
An Alternative View: When Does SGD Escape Local Minima?
Robert D. Kleinberg
Yuanzhi Li
Yang Yuan
MLT
101
317
0
17 Feb 2018
A Progressive Batching L-BFGS Method for Machine Learning
Raghu Bollapragada
Dheevatsa Mudigere
J. Nocedal
Hao-Jun Michael Shi
P. T. P. Tang
ODL
114
153
0
15 Feb 2018
Stronger generalization bounds for deep nets via a compression approach
Sanjeev Arora
Rong Ge
Behnam Neyshabur
Yi Zhang
MLT
AI4CE
187
643
0
14 Feb 2018
Towards Understanding the Generalization Bias of Two Layer Convolutional Linear Classifiers with Gradient Descent
Yifan Wu
Barnabás Póczós
Aarti Singh
MLT
46
8
0
13 Feb 2018