ResearchTrend.AI
Train faster, generalize better: Stability of stochastic gradient descent

3 September 2015
Moritz Hardt
Benjamin Recht
Y. Singer

Papers citing "Train faster, generalize better: Stability of stochastic gradient descent"

50 / 679 papers shown
Stagewise Training Accelerates Convergence of Testing Error Over SGD
Zhuoning Yuan
Yan Yan
Rong Jin
Tianbao Yang
111
11
0
10 Dec 2018
Deep Frank-Wolfe For Neural Network Optimization
Leonard Berrada
Andrew Zisserman
M. P. Kumar
ODL
67
40
0
19 Nov 2018
Gradient Descent Finds Global Minima of Deep Neural Networks
S. Du
Jason D. Lee
Haochuan Li
Liwei Wang
Masayoshi Tomizuka
ODL
367
1,137
0
09 Nov 2018
Collaborative Filtering with Stability
Dongsheng Li
Chao Chen
Q. Lv
Junchi Yan
Li Shang
Stephen M. Chu
36
0
0
06 Nov 2018
Regularization Matters: Generalization and Optimization of Neural Nets v.s. their Induced Kernel
Colin Wei
Jason D. Lee
Qiang Liu
Tengyu Ma
287
245
0
12 Oct 2018
Graph-Dependent Implicit Regularisation for Distributed Stochastic Subgradient Descent
Dominic Richards
Patrick Rebeschini
73
18
0
18 Sep 2018
Global Convergence of Stochastic Gradient Hamiltonian Monte Carlo for Non-Convex Stochastic Optimization: Non-Asymptotic Performance Bounds and Momentum-Based Acceleration
Xuefeng Gao
Mert Gurbuzbalaban
Lingjiong Zhu
105
60
0
12 Sep 2018
On the Generalization of Stochastic Gradient Descent with Momentum
Ali Ramezani-Kebrya
Kimon Antonakopoulos
Volkan Cevher
Ashish Khisti
Ben Liang
MLT
73
26
0
12 Sep 2018
Decentralized Differentially Private Without-Replacement Stochastic Gradient Descent
Richeng Jin
Xiaofan He
H. Dai
FedML
71
2
0
08 Sep 2018
A Unified Analysis of Stochastic Momentum Methods for Deep Learning
Yan Yan
Tianbao Yang
Zhe Li
Qihang Lin
Yi Yang
59
120
0
30 Aug 2018
Lipschitz regularized Deep Neural Networks generalize and are adversarially robust
Chris Finlay
Jeff Calder
Bilal Abbasi
Adam M. Oberman
106
55
0
28 Aug 2018
Universal Stagewise Learning for Non-Convex Problems with Convergence on Averaged Solutions
Zaiyi Chen
Zhuoning Yuan
Jinfeng Yi
Bowen Zhou
Enhong Chen
Tianbao Yang
76
58
0
20 Aug 2018
Understanding training and generalization in deep learning by Fourier analysis
Zhi-Qin John Xu
AI4CE
93
94
0
13 Aug 2018
Generalization Error in Deep Learning
Daniel Jakubovitz
Raja Giryes
M. Rodrigues
AI4CE
244
111
0
03 Aug 2018
Machine Learning with Membership Privacy using Adversarial Regularization
Milad Nasr
Reza Shokri
Amir Houmansadr
FedML, MIACV
87
477
0
16 Jul 2018
Training behavior of deep neural network in frequency domain
Zhi-Qin John Xu
Yaoyu Zhang
Yan Xiao
AI4CE
99
320
0
03 Jul 2018
Theory IIIb: Generalization in Deep Networks
T. Poggio
Q. Liao
Alycia Lee
Andrzej Banburski
Xavier Boix
Jack Hidary
ODL, AI4CE
104
26
0
29 Jun 2018
Laplacian Smoothing Gradient Descent
Stanley Osher
Bao Wang
Penghang Yin
Xiyang Luo
Farzin Barekat
Minh Pham
A. Lin
ODL
113
43
0
17 Jun 2018
Stochastic Gradient Descent with Exponential Convergence Rates of Expected Classification Errors
Atsushi Nitanda
Taiji Suzuki
77
10
0
14 Jun 2018
Improving Regression Performance with Distributional Losses
Ehsan Imani
Martha White
UQCV
74
67
0
12 Jun 2018
PAC-Bayes Control: Learning Policies that Provably Generalize to Novel Environments
Anirudha Majumdar
M. Goldstein
Anoopkumar Sonar
104
18
0
11 Jun 2018
Training Faster by Separating Modes of Variation in Batch-normalized Models
Mahdi M. Kalayeh
M. Shah
70
43
0
07 Jun 2018
Implicit regularization and solution uniqueness in over-parameterized matrix sensing
Kelly Geyer
Anastasios Kyrillidis
A. Kalev
126
4
0
06 Jun 2018
Representational Power of ReLU Networks and Polynomial Kernels: Beyond Worst-Case Analysis
Frederic Koehler
Andrej Risteski
47
12
0
29 May 2018
Stable Recurrent Models
John Miller
Moritz Hardt
99
119
0
25 May 2018
How Much Restricted Isometry is Needed In Nonconvex Matrix Recovery?
Richard Y. Zhang
C. Josz
Somayeh Sojoudi
Javad Lavaei
70
42
0
25 May 2018
Statistical Optimality of Stochastic Gradient Descent on Hard Learning Problems through Multiple Passes
Loucas Pillaud-Vivien
Alessandro Rudi
Francis R. Bach
187
103
0
25 May 2018
Deep learning generalizes because the parameter-function map is biased towards simple functions
Guillermo Valle Pérez
Chico Q. Camargo
A. Louis
MLT, AI4CE
124
232
0
22 May 2018
Measuring and regularizing networks in function space
Ari S. Benjamin
David Rolnick
Konrad Paul Kording
87
140
0
21 May 2018
Stochastic modified equations for the asynchronous stochastic gradient descent
Jing An
Jian-wei Lu
Lexing Ying
77
79
0
21 May 2018
Constrained-CNN losses for weakly supervised segmentation
H. Kervadec
Jose Dolz
Meng Tang
Eric Granger
Yuri Boykov
Ismail Ben Ayed
92
240
0
12 May 2018
SaaS: Speed as a Supervisor for Semi-supervised Learning
Safa Cicek
Alhussein Fawzi
Stefano Soatto
BDL
85
19
0
02 May 2018
Stability of the Stochastic Gradient Method for an Approximated Large Scale Kernel Machine
A. Samareh
Mahshid Salemi Parizi
13
0
0
21 Apr 2018
A Study on Overfitting in Deep Reinforcement Learning
Chiyuan Zhang
Oriol Vinyals
Rémi Munos
Samy Bengio
OffRL, OnRL
63
391
0
18 Apr 2018
Regularisation of Neural Networks by Enforcing Lipschitz Continuity
Henry Gouk
E. Frank
Bernhard Pfahringer
M. Cree
208
485
0
12 Apr 2018
Stability and Convergence Trade-off of Iterative Optimization Algorithms
Yuansi Chen
Chi Jin
Bin Yu
71
93
0
04 Apr 2018
Privacy-preserving Prediction
Cynthia Dwork
Vitaly Feldman
82
91
0
27 Mar 2018
Gradient Descent Quantizes ReLU Network Features
Hartmut Maennel
Olivier Bousquet
Sylvain Gelly
MLT
74
82
0
22 Mar 2018
Robust Blind Deconvolution via Mirror Descent
Sathya Ravi
Ronak R. Mehta
Vikas Singh
26
3
0
21 Mar 2018
Constrained Deep Learning using Conditional Gradient and Applications in Computer Vision
Sathya Ravi
Tuan Dinh
Vishnu Suresh Lokhande
Vikas Singh
AI4CE
71
22
0
17 Mar 2018
Model-Agnostic Private Learning via Stability
Raef Bassily
Om Thakkar
Abhradeep Thakurta
FedML
41
12
0
14 Mar 2018
On the Power of Over-parametrization in Neural Networks with Quadratic Activation
S. Du
Jason D. Lee
188
272
0
03 Mar 2018
A Walk with SGD
Chen Xing
Devansh Arpit
Christos Tsirigotis
Yoshua Bengio
102
119
0
24 Feb 2018
On the Connection Between Learning Two-Layers Neural Networks and Tensor Decomposition
Marco Mondelli
Andrea Montanari
MLT, CML
96
59
0
20 Feb 2018
Generalization Error Bounds with Probabilistic Guarantee for SGD in Nonconvex Optimization
Yi Zhou
Yingbin Liang
Huishuai Zhang
MLT
87
26
0
19 Feb 2018
An analysis of training and generalization errors in shallow and deep networks
H. Mhaskar
T. Poggio
UQCV
59
18
0
17 Feb 2018
An Alternative View: When Does SGD Escape Local Minima?
Robert D. Kleinberg
Yuanzhi Li
Yang Yuan
MLT
101
317
0
17 Feb 2018
A Progressive Batching L-BFGS Method for Machine Learning
Raghu Bollapragada
Dheevatsa Mudigere
J. Nocedal
Hao-Jun Michael Shi
P. T. P. Tang
ODL
114
153
0
15 Feb 2018
Stronger generalization bounds for deep nets via a compression approach
Sanjeev Arora
Rong Ge
Behnam Neyshabur
Yi Zhang
MLT, AI4CE
187
643
0
14 Feb 2018
Towards Understanding the Generalization Bias of Two Layer Convolutional Linear Classifiers with Gradient Descent
Yifan Wu
Barnabás Póczós
Aarti Singh
MLT
46
8
0
13 Feb 2018