ResearchTrend.AI

Train faster, generalize better: Stability of stochastic gradient descent

3 September 2015
Moritz Hardt
Benjamin Recht
Y. Singer

Papers citing "Train faster, generalize better: Stability of stochastic gradient descent"

50 / 679 papers shown
Can Implicit Bias Explain Generalization? Stochastic Convex Optimization as a Case Study
Assaf Dauber
M. Feder
Tomer Koren
Roi Livni
102
24
0
13 Mar 2020
Revisiting SGD with Increasingly Weighted Averaging: Optimization and Generalization Perspectives
Zhishuai Guo
Yan Yan
Tianbao Yang
MoMe
75
4
0
09 Mar 2020
Forgetting Outside the Box: Scrubbing Deep Networks of Information Accessible from Input-Output Observations
Aditya Golatkar
Alessandro Achille
Stefano Soatto
MU, OOD
179
198
0
05 Mar 2020
Disentangling Adaptive Gradient Methods from Learning Rates
Naman Agarwal
Rohan Anil
Elad Hazan
Tomer Koren
Cyril Zhang
109
38
0
26 Feb 2020
Stagewise Enlargement of Batch Size for SGD-based Learning
Shen-Yi Zhao
Yin-Peng Xie
Wu-Jun Li
47
1
0
26 Feb 2020
Understanding Self-Training for Gradual Domain Adaptation
Ananya Kumar
Tengyu Ma
Percy Liang
CLL, TTA
100
233
0
26 Feb 2020
Coherent Gradients: An Approach to Understanding Generalization in Gradient Descent-based Optimization
S. Chatterjee
ODL, OOD
122
51
0
25 Feb 2020
Stochastic Polyak Step-size for SGD: An Adaptive Learning Rate for Fast Convergence
Nicolas Loizou
Sharan Vaswani
I. Laradji
Simon Lacoste-Julien
107
189
0
24 Feb 2020
De-randomized PAC-Bayes Margin Bounds: Applications to Non-convex and Non-smooth Predictors
A. Banerjee
Tiancong Chen
Yingxue Zhou
BDL
86
8
0
23 Feb 2020
On the generalization of bayesian deep nets for multi-class classification
Yossi Adi
Yaniv Nemcovsky
Alex Schwing
Tamir Hazan
BDL, UQCV
34
1
0
23 Feb 2020
Bounding the expected run-time of nonconvex optimization with early stopping
Thomas Flynn
K. Yu
A. Malik
Nicolas D'Imperio
Shinjae Yoo
71
2
0
20 Feb 2020
Data Heterogeneity Differential Privacy: From Theory to Algorithm
Yilin Kang
Jian Li
Yong Liu
Weiping Wang
76
1
0
20 Feb 2020
Performative Prediction
Juan C. Perdomo
Tijana Zrnic
Celestine Mendler-Dünner
Moritz Hardt
203
322
0
16 Feb 2020
Statistical Learning with Conditional Value at Risk
Tasuku Soma
Yuichi Yoshida
88
38
0
14 Feb 2020
A Diffusion Theory For Deep Learning Dynamics: Stochastic Gradient Descent Exponentially Favors Flat Minima
Zeke Xie
Issei Sato
Masashi Sugiyama
ODL
127
17
0
10 Feb 2020
On the distance between two neural networks and the stability of learning
Jeremy Bernstein
Arash Vahdat
Yisong Yue
Xuan Li
ODL
284
59
0
09 Feb 2020
Characterizing Structural Regularities of Labeled Data in Overparameterized Models
Ziheng Jiang
Chiyuan Zhang
Kunal Talwar
Michael C. Mozer
TDI
68
104
0
08 Feb 2020
The Statistical Complexity of Early-Stopped Mirror Descent
Tomas Vaskevicius
Varun Kanade
Patrick Rebeschini
90
23
0
01 Feb 2020
Reasoning About Generalization via Conditional Mutual Information
Thomas Steinke
Lydia Zakynthinou
179
166
0
24 Jan 2020
Understanding Why Neural Networks Generalize Well Through GSNR of Parameters
Jinlong Liu
Guo-qing Jiang
Yunzhi Bai
Ting Chen
Huayan Wang
AI4CE
150
50
0
21 Jan 2020
Generalization Bounds for High-dimensional M-estimation under Sparsity Constraint
Xiao-Tong Yuan
Ping Li
85
2
0
20 Jan 2020
Big-Data Science in Porous Materials: Materials Genomics and Machine Learning
Kevin Maik Jablonka
D. Ongari
S. M. Moosavi
B. Smit
AI4CE
93
365
0
18 Jan 2020
Understanding Generalization in Deep Learning via Tensor Methods
Jingling Li
Yanchao Sun
Jiahao Su
Taiji Suzuki
Furong Huang
131
28
0
14 Jan 2020
Poly-time universality and limitations of deep learning
Emmanuel Abbe
Colin Sandon
62
23
0
07 Jan 2020
Large-scale Kernel Methods and Applications to Lifelong Robot Learning
Raffaello Camoriano
84
1
0
11 Dec 2019
Fantastic Generalization Measures and Where to Find Them
Yiding Jiang
Behnam Neyshabur
H. Mobahi
Dilip Krishnan
Samy Bengio
AI4CE
175
613
0
04 Dec 2019
A Generalization Theory based on Independent and Task-Identically Distributed Assumption
Guanhua Zheng
Jitao Sang
Houqiang Li
Jian Yu
Changsheng Xu
OOD
40
1
0
28 Nov 2019
Distributionally Robust Neural Networks for Group Shifts: On the Importance of Regularization for Worst-Case Generalization
Shiori Sagawa
Pang Wei Koh
Tatsunori B. Hashimoto
Percy Liang
OOD
117
1,254
0
20 Nov 2019
Layer-Dependent Importance Sampling for Training Deep and Large Graph Convolutional Networks
Difan Zou
Ziniu Hu
Yewen Wang
Song Jiang
Yizhou Sun
Quanquan Gu
GNN
123
286
0
17 Nov 2019
Eternal Sunshine of the Spotless Net: Selective Forgetting in Deep Networks
Aditya Golatkar
Alessandro Achille
Stefano Soatto
CLL, MU
114
508
0
12 Nov 2019
A Comprehensive Comparison of Machine Learning Based Methods Used in Bengali Question Classification
Afra Anika
Md. Hasibur Rahman
Salekul Islam
Abu Shafin Mohammad Mahdee Jameel
C. R. Rahman
13
2
0
08 Nov 2019
Information-Theoretic Generalization Bounds for SGLD via Data-Dependent Estimates
Jeffrey Negrea
Mahdi Haghifam
Gintare Karolina Dziugaite
Ashish Khisti
Daniel M. Roy
FedML
222
153
0
06 Nov 2019
Diametrical Risk Minimization: Theory and Computations
Matthew Norton
J. Royset
61
19
0
24 Oct 2019
Sharper bounds for uniformly stable algorithms
Olivier Bousquet
Yegor Klochkov
Nikita Zhivotovskiy
71
122
0
17 Oct 2019
The Implicit Regularization of Ordinary Least Squares Ensembles
Daniel LeJeune
Hamid Javadi
Richard G. Baraniuk
143
43
0
10 Oct 2019
Improved Sample Complexities for Deep Networks and Robust Classification via an All-Layer Margin
Colin Wei
Tengyu Ma
AAML, OOD
105
85
0
09 Oct 2019
Partial differential equation regularization for supervised machine learning
Jillian R. Fisher
63
2
0
03 Oct 2019
Distributed SGD Generalizes Well Under Asynchrony
Jayanth Reddy Regatti
Gaurav Tendolkar
Yi Zhou
Abhishek Gupta
Yingbin Liang
FedML
39
7
0
29 Sep 2019
Randomized Iterative Methods for Linear Systems: Momentum, Inexactness and Gossip
Nicolas Loizou
75
5
0
26 Sep 2019
Compression based bound for non-compressed network: unified generalization error analysis of large compressible deep neural network
Taiji Suzuki
Hiroshi Abe
Tomoaki Nishimura
AI4CE
81
44
0
25 Sep 2019
On-line Non-Convex Constrained Optimization
Olivier Massicot
Jakub Mareček
47
13
0
16 Sep 2019
Deep Learning Theory Review: An Optimal Control and Dynamical Systems Perspective
Guan-Horng Liu
Evangelos A. Theodorou
AI4CE
121
72
0
28 Aug 2019
Private Stochastic Convex Optimization with Optimal Rates
Raef Bassily
Vitaly Feldman
Kunal Talwar
Abhradeep Thakurta
94
246
0
27 Aug 2019
Path Length Bounds for Gradient Descent and Flow
Chirag Gupta
Sivaraman Balakrishnan
Aaditya Ramdas
152
15
0
02 Aug 2019
Bias of Homotopic Gradient Descent for the Hinge Loss
Denali Molitor
Deanna Needell
Rachel A. Ward
39
6
0
26 Jul 2019
Mix and Match: An Optimistic Tree-Search Approach for Learning Models from Mixture Distributions
Matthew Faw
Rajat Sen
Karthikeyan Shanmugam
Constantine Caramanis
Sanjay Shakkottai
83
3
0
23 Jul 2019
Generalization Guarantees for Neural Networks via Harnessing the Low-rank Structure of the Jacobian
Samet Oymak
Zalan Fabian
Mingchen Li
Mahdi Soltanolkotabi
MLT
93
89
0
12 Jun 2019
Does Learning Require Memorization? A Short Tale about a Long Tail
Vitaly Feldman
TDI
210
504
0
12 Jun 2019
Importance Resampling for Off-policy Prediction
M. Schlegel
Wesley Chung
Daniel Graves
Jian Qian
Martha White
OffRL
57
41
0
11 Jun 2019
Understanding Generalization through Visualizations
Wenjie Huang
Z. Emam
Micah Goldblum
Liam H. Fowl
J. K. Terry
Furong Huang
Tom Goldstein
AI4CE
72
80
0
07 Jun 2019