1509.01240
Train faster, generalize better: Stability of stochastic gradient descent
3 September 2015
Moritz Hardt
Benjamin Recht
Y. Singer
Papers citing
"Train faster, generalize better: Stability of stochastic gradient descent"
50 / 679 papers shown
Title
Can Implicit Bias Explain Generalization? Stochastic Convex Optimization as a Case Study
Assaf Dauber
M. Feder
Tomer Koren
Roi Livni
102
24
0
13 Mar 2020
Revisiting SGD with Increasingly Weighted Averaging: Optimization and Generalization Perspectives
Zhishuai Guo
Yan Yan
Tianbao Yang
MoMe
75
4
0
09 Mar 2020
Forgetting Outside the Box: Scrubbing Deep Networks of Information Accessible from Input-Output Observations
Aditya Golatkar
Alessandro Achille
Stefano Soatto
MU
OOD
179
198
0
05 Mar 2020
Disentangling Adaptive Gradient Methods from Learning Rates
Naman Agarwal
Rohan Anil
Elad Hazan
Tomer Koren
Cyril Zhang
109
38
0
26 Feb 2020
Stagewise Enlargement of Batch Size for SGD-based Learning
Shen-Yi Zhao
Yin-Peng Xie
Wu-Jun Li
47
1
0
26 Feb 2020
Understanding Self-Training for Gradual Domain Adaptation
Ananya Kumar
Tengyu Ma
Percy Liang
CLL
TTA
100
233
0
26 Feb 2020
Coherent Gradients: An Approach to Understanding Generalization in Gradient Descent-based Optimization
S. Chatterjee
ODL
OOD
122
51
0
25 Feb 2020
Stochastic Polyak Step-size for SGD: An Adaptive Learning Rate for Fast Convergence
Nicolas Loizou
Sharan Vaswani
I. Laradji
Simon Lacoste-Julien
107
189
0
24 Feb 2020
De-randomized PAC-Bayes Margin Bounds: Applications to Non-convex and Non-smooth Predictors
A. Banerjee
Tiancong Chen
Yingxue Zhou
BDL
86
8
0
23 Feb 2020
On the generalization of bayesian deep nets for multi-class classification
Yossi Adi
Yaniv Nemcovsky
Alex Schwing
Tamir Hazan
BDL
UQCV
34
1
0
23 Feb 2020
Bounding the expected run-time of nonconvex optimization with early stopping
Thomas Flynn
K. Yu
A. Malik
Nicolas D'Imperio
Shinjae Yoo
71
2
0
20 Feb 2020
Data Heterogeneity Differential Privacy: From Theory to Algorithm
Yilin Kang
Jian Li
Yong Liu
Weiping Wang
76
1
0
20 Feb 2020
Performative Prediction
Juan C. Perdomo
Tijana Zrnic
Celestine Mendler-Dünner
Moritz Hardt
203
322
0
16 Feb 2020
Statistical Learning with Conditional Value at Risk
Tasuku Soma
Yuichi Yoshida
88
38
0
14 Feb 2020
A Diffusion Theory For Deep Learning Dynamics: Stochastic Gradient Descent Exponentially Favors Flat Minima
Zeke Xie
Issei Sato
Masashi Sugiyama
ODL
127
17
0
10 Feb 2020
On the distance between two neural networks and the stability of learning
Jeremy Bernstein
Arash Vahdat
Yisong Yue
Xuan Li
ODL
284
59
0
09 Feb 2020
Characterizing Structural Regularities of Labeled Data in Overparameterized Models
Ziheng Jiang
Chiyuan Zhang
Kunal Talwar
Michael C. Mozer
TDI
68
104
0
08 Feb 2020
The Statistical Complexity of Early-Stopped Mirror Descent
Tomas Vaskevicius
Varun Kanade
Patrick Rebeschini
90
23
0
01 Feb 2020
Reasoning About Generalization via Conditional Mutual Information
Thomas Steinke
Lydia Zakynthinou
179
166
0
24 Jan 2020
Understanding Why Neural Networks Generalize Well Through GSNR of Parameters
Jinlong Liu
Guo-qing Jiang
Yunzhi Bai
Ting Chen
Huayan Wang
AI4CE
150
50
0
21 Jan 2020
Generalization Bounds for High-dimensional M-estimation under Sparsity Constraint
Xiao-Tong Yuan
Ping Li
85
2
0
20 Jan 2020
Big-Data Science in Porous Materials: Materials Genomics and Machine Learning
Kevin Maik Jablonka
D. Ongari
S. M. Moosavi
B. Smit
AI4CE
93
365
0
18 Jan 2020
Understanding Generalization in Deep Learning via Tensor Methods
Jingling Li
Yanchao Sun
Jiahao Su
Taiji Suzuki
Furong Huang
131
28
0
14 Jan 2020
Poly-time universality and limitations of deep learning
Emmanuel Abbe
Colin Sandon
62
23
0
07 Jan 2020
Large-scale Kernel Methods and Applications to Lifelong Robot Learning
Raffaello Camoriano
84
1
0
11 Dec 2019
Fantastic Generalization Measures and Where to Find Them
Yiding Jiang
Behnam Neyshabur
H. Mobahi
Dilip Krishnan
Samy Bengio
AI4CE
175
613
0
04 Dec 2019
A Generalization Theory based on Independent and Task-Identically Distributed Assumption
Guanhua Zheng
Jitao Sang
Houqiang Li
Jian Yu
Changsheng Xu
OOD
40
1
0
28 Nov 2019
Distributionally Robust Neural Networks for Group Shifts: On the Importance of Regularization for Worst-Case Generalization
Shiori Sagawa
Pang Wei Koh
Tatsunori B. Hashimoto
Percy Liang
OOD
117
1,254
0
20 Nov 2019
Layer-Dependent Importance Sampling for Training Deep and Large Graph Convolutional Networks
Difan Zou
Ziniu Hu
Yewen Wang
Song Jiang
Yizhou Sun
Quanquan Gu
GNN
123
286
0
17 Nov 2019
Eternal Sunshine of the Spotless Net: Selective Forgetting in Deep Networks
Aditya Golatkar
Alessandro Achille
Stefano Soatto
CLL
MU
114
508
0
12 Nov 2019
A Comprehensive Comparison of Machine Learning Based Methods Used in Bengali Question Classification
Afra Anika
Md. Hasibur Rahman
Salekul Islam
Abu Shafin Mohammad Mahdee Jameel
C. R. Rahman
13
2
0
08 Nov 2019
Information-Theoretic Generalization Bounds for SGLD via Data-Dependent Estimates
Jeffrey Negrea
Mahdi Haghifam
Gintare Karolina Dziugaite
Ashish Khisti
Daniel M. Roy
FedML
222
153
0
06 Nov 2019
Diametrical Risk Minimization: Theory and Computations
Matthew Norton
J. Royset
61
19
0
24 Oct 2019
Sharper bounds for uniformly stable algorithms
Olivier Bousquet
Yegor Klochkov
Nikita Zhivotovskiy
71
122
0
17 Oct 2019
The Implicit Regularization of Ordinary Least Squares Ensembles
Daniel LeJeune
Hamid Javadi
Richard G. Baraniuk
143
43
0
10 Oct 2019
Improved Sample Complexities for Deep Networks and Robust Classification via an All-Layer Margin
Colin Wei
Tengyu Ma
AAML
OOD
105
85
0
09 Oct 2019
Partial differential equation regularization for supervised machine learning
Jillian R. Fisher
63
2
0
03 Oct 2019
Distributed SGD Generalizes Well Under Asynchrony
Jayanth Reddy Regatti
Gaurav Tendolkar
Yi Zhou
Abhishek Gupta
Yingbin Liang
FedML
39
7
0
29 Sep 2019
Randomized Iterative Methods for Linear Systems: Momentum, Inexactness and Gossip
Nicolas Loizou
75
5
0
26 Sep 2019
Compression based bound for non-compressed network: unified generalization error analysis of large compressible deep neural network
Taiji Suzuki
Hiroshi Abe
Tomoaki Nishimura
AI4CE
81
44
0
25 Sep 2019
On-line Non-Convex Constrained Optimization
Olivier Massicot
Jakub Mareček
47
13
0
16 Sep 2019
Deep Learning Theory Review: An Optimal Control and Dynamical Systems Perspective
Guan-Horng Liu
Evangelos A. Theodorou
AI4CE
121
72
0
28 Aug 2019
Private Stochastic Convex Optimization with Optimal Rates
Raef Bassily
Vitaly Feldman
Kunal Talwar
Abhradeep Thakurta
94
246
0
27 Aug 2019
Path Length Bounds for Gradient Descent and Flow
Chirag Gupta
Sivaraman Balakrishnan
Aaditya Ramdas
152
15
0
02 Aug 2019
Bias of Homotopic Gradient Descent for the Hinge Loss
Denali Molitor
Deanna Needell
Rachel A. Ward
39
6
0
26 Jul 2019
Mix and Match: An Optimistic Tree-Search Approach for Learning Models from Mixture Distributions
Matthew Faw
Rajat Sen
Karthikeyan Shanmugam
Constantine Caramanis
Sanjay Shakkottai
83
3
0
23 Jul 2019
Generalization Guarantees for Neural Networks via Harnessing the Low-rank Structure of the Jacobian
Samet Oymak
Zalan Fabian
Mingchen Li
Mahdi Soltanolkotabi
MLT
93
89
0
12 Jun 2019
Does Learning Require Memorization? A Short Tale about a Long Tail
Vitaly Feldman
TDI
210
504
0
12 Jun 2019
Importance Resampling for Off-policy Prediction
M. Schlegel
Wesley Chung
Daniel Graves
Jian Qian
Martha White
OffRL
57
41
0
11 Jun 2019
Understanding Generalization through Visualizations
Wenjie Huang
Z. Emam
Micah Goldblum
Liam H. Fowl
J. K. Terry
Furong Huang
Tom Goldstein
AI4CE
72
80
0
07 Jun 2019