arXiv:1509.01240
Train faster, generalize better: Stability of stochastic gradient descent
Moritz Hardt, Benjamin Recht, Yoram Singer
3 September 2015
Papers citing "Train faster, generalize better: Stability of stochastic gradient descent" (50 of 679 shown), listed as title, authors, topic tags where given, and date:

1. Train simultaneously, generalize better: Stability of gradient-based minimax learners. Farzan Farnia, Asuman Ozdaglar. 23 Oct 2020.
2. Feature Selection for Huge Data via Minipatch Learning. Tianyi Yao, Genevera I. Allen. 16 Oct 2020.
3. The Deep Bootstrap Framework: Good Online Learners are Good Offline Generalizers. Preetum Nakkiran, Behnam Neyshabur, Hanie Sedghi. (OffRL) 16 Oct 2020.
4. Deep generative demixing: Recovering Lipschitz signals from noisy subgaussian mixtures. Aaron Berk. 13 Oct 2020.
5. Explaining Neural Matrix Factorization with Gradient Rollback. Carolin (Haas) Lawrence, T. Sztyler, Mathias Niepert. 12 Oct 2020.
6. How Does Mixup Help With Robustness and Generalization? Linjun Zhang, Zhun Deng, Kenji Kawaguchi, Amirata Ghorbani, James Zou. (AAML) 09 Oct 2020.
7. Learning Binary Decision Trees by Argmin Differentiation. Valentina Zantedeschi, Matt J. Kusner, Vlad Niculae. 09 Oct 2020.
8. Kernel regression in high dimensions: Refined analysis beyond double descent. Fanghui Liu, Zhenyu Liao, Johan A. K. Suykens. 06 Oct 2020.
9. Learning Optimal Representations with the Decodable Information Bottleneck. Yann Dubois, Douwe Kiela, D. Schwab, Ramakrishna Vedantam. 27 Sep 2020.
10. Faster Biological Gradient Descent Learning. H. Li. (ODL) 27 Sep 2020.
11. How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks. Keyulu Xu, Mozhi Zhang, Jingling Li, S. Du, Ken-ichi Kawarabayashi, Stefanie Jegelka. (MLT) 24 Sep 2020.
12. Implicit Gradient Regularization. David Barrett, Benoit Dherin. 23 Sep 2020.
13. Hybrid Stochastic-Deterministic Minibatch Proximal Gradient: Less-Than-Single-Pass Optimization with Nearly Optimal Generalization. Pan Zhou, Xiaotong Yuan. 18 Sep 2020.
14. GraphNorm: A Principled Approach to Accelerating Graph Neural Network Training. Tianle Cai, Shengjie Luo, Keyulu Xu, Di He, Tie-Yan Liu, Liwei Wang. (GNN) 07 Sep 2020.
15. Hybrid Differentially Private Federated Learning on Vertically Partitioned Data. Chang Wang, Jian Liang, Mingkai Huang, Bing Bai, Kun Bai, Hao Li. (FedML) 06 Sep 2020.
16. Making Coherence Out of Nothing At All: Measuring the Evolution of Gradient Alignment. S. Chatterjee, Piotr Zielinski. 03 Aug 2020.
17. Principles and Algorithms for Forecasting Groups of Time Series: Locality and Globality. Pablo Montero-Manso, Rob J. Hyndman. (AI4TS) 02 Aug 2020.
18. Cross-validation Confidence Intervals for Test Error. Pierre Bayle, Alexandre Bayle, Lucas Janson, Lester W. Mackey. 24 Jul 2020.
19. Tighter Generalization Bounds for Iterative Differentially Private Learning Algorithms. Fengxiang He, Bohan Wang, Dacheng Tao. (FedML) 18 Jul 2020.
20. Measurement error models: from nonparametric methods to deep neural networks. Zhirui Hu, Z. Ke, Jun S. Liu. 15 Jul 2020.
21. Stochastic Hamiltonian Gradient Methods for Smooth Games. Nicolas Loizou, Hugo Berard, Alexia Jolicoeur-Martineau, Pascal Vincent, Simon Lacoste-Julien, Ioannis Mitliagkas. 08 Jul 2020.
22. Meta-Learning with Network Pruning. Hongduan Tian, Bo Liu, Xiaotong Yuan, Qingshan Liu. 07 Jul 2020.
23. AdaSGD: Bridging the gap between SGD and Adam. Jiaxuan Wang, Jenna Wiens. 30 Jun 2020.
24. Adaptive Inertia: Disentangling the Effects of Adaptive Learning Rate and Momentum. Zeke Xie, Xinrui Wang, Huishuai Zhang, Issei Sato, Masashi Sugiyama. (ODL) 29 Jun 2020.
25. Lipschitzness Is All You Need To Tame Off-policy Generative Adversarial Imitation Learning. Lionel Blondé, Pablo Strasser, Alexandros Kalousis. 28 Jun 2020.
26. Stability Enhanced Privacy and Applications in Private Stochastic Gradient Descent. Lauren Watson, Benedek Rozemberczki, Rik Sarkar. 25 Jun 2020.
27. Accelerated Large Batch Optimization of BERT Pretraining in 54 minutes. Shuai Zheng, Yanghua Peng, Sheng Zha, Mu Li. (ODL) 24 Jun 2020.
28. ByGARS: Byzantine SGD with Arbitrary Number of Attackers. Jayanth Reddy Regatti, Hao Chen, Abhishek Gupta. (FedML, AAML) 24 Jun 2020.
29. Understanding Deep Architectures with Reasoning Layer. Xinshi Chen, Yufei Zhang, C. Reisinger, Le Song. (AI4CE) 24 Jun 2020.
30. Training (Overparametrized) Neural Networks in Near-Linear Time. Jan van den Brand, Binghui Peng, Zhao Song, Omri Weinstein. (ODL) 20 Jun 2020.
31. Stochastic Gradient Descent in Hilbert Scales: Smoothness, Preconditioning and Earlier Stopping. Nicole Mücke, Enrico Reiss. 18 Jun 2020.
32. SGD for Structured Nonconvex Functions: Learning Rates, Minibatching and Interpolation. Robert Mansel Gower, Othmane Sebbouh, Nicolas Loizou. 18 Jun 2020.
33. Federated Accelerated Stochastic Gradient Descent. Honglin Yuan, Tengyu Ma. (FedML) 16 Jun 2020.
34. Shape Matters: Understanding the Implicit Bias of the Noise Covariance. Jeff Z. HaoChen, Colin Wei, Jason D. Lee, Tengyu Ma. 15 Jun 2020.
35. Fine-Grained Analysis of Stability and Generalization for Stochastic Gradient Descent. Yunwen Lei, Yiming Ying. (MLT) 15 Jun 2020.
36. Stability of Stochastic Gradient Descent on Nonsmooth Convex Losses. Raef Bassily, Vitaly Feldman, Cristóbal Guzmán, Kunal Talwar. (MLT) 12 Jun 2020.
37. Revisiting Explicit Regularization in Neural Networks for Well-Calibrated Predictive Uncertainty. Taejong Joo, U. Chung. (BDL, UQCV) 11 Jun 2020.
38. Speedy Performance Estimation for Neural Architecture Search. Binxin Ru, Clare Lyle, Lisa Schut, M. Fil, Mark van der Wilk, Y. Gal. 08 Jun 2020.
39. Bayesian Neural Network via Stochastic Gradient Descent. Abhinav Sagar. (UQCV, BDL) 04 Jun 2020.
40. Instability, Computational Efficiency and Statistical Accuracy. Nhat Ho, K. Khamaru, Raaz Dwivedi, Martin J. Wainwright, Michael I. Jordan, Bin Yu. 22 May 2020.
41. Stochastic Optimization with Heavy-Tailed Noise via Accelerated Gradient Clipping. Eduard A. Gorbunov, Marina Danilova, Alexander Gasnikov. 21 May 2020.
42. LALR: Theoretical and Experimental validation of Lipschitz Adaptive Learning Rate in Regression and Neural Networks. Snehanshu Saha, Tejas Prashanth, Suraj Aralihalli, Sumedh Basarkod, T. Sudarshan, S. Dhavala. 19 May 2020.
43. Scaling-up Distributed Processing of Data Streams for Machine Learning. M. Nokleby, Haroon Raja, W. Bajwa. 18 May 2020.
44. Private Stochastic Convex Optimization: Optimal Rates in Linear Time. Vitaly Feldman, Tomer Koren, Kunal Talwar. 10 May 2020.
45. Stochastic batch size for adaptive regularization in deep network optimization. Kensuke Nakamura, Stefano Soatto, Byung-Woo Hong. (ODL) 14 Apr 2020.
46. Detached Error Feedback for Distributed SGD with Random Sparsification. An Xu, Heng-Chiao Huang. 11 Apr 2020.
47. R-FORCE: Robust Learning for Random Recurrent Neural Networks. Yang Zheng, Eli Shlizerman. (OOD) 25 Mar 2020.
48. A termination criterion for stochastic gradient descent for binary classification. Sina Baghal, Courtney Paquette, S. Vavasis. 23 Mar 2020.
49. Weak and Strong Gradient Directions: Explaining Memorization, Generalization, and Hardness of Examples at Scale. Piotr Zielinski, Shankar Krishnan, S. Chatterjee. (ODL) 16 Mar 2020.
50. Interference and Generalization in Temporal Difference Learning. Emmanuel Bengio, Joelle Pineau, Doina Precup. 13 Mar 2020.