Train faster, generalize better: Stability of stochastic gradient descent

3 September 2015
Moritz Hardt, Benjamin Recht, Y. Singer

Papers citing "Train faster, generalize better: Stability of stochastic gradient descent"

Showing 50 of 275 citing papers.
The Deep Bootstrap Framework: Good Online Learners are Good Offline Generalizers
Preetum Nakkiran, Behnam Neyshabur, Hanie Sedghi
OffRL · 16 Oct 2020

Deep generative demixing: Recovering Lipschitz signals from noisy subgaussian mixtures
Aaron Berk
13 Oct 2020

How Does Mixup Help With Robustness and Generalization?
Linjun Zhang, Zhun Deng, Kenji Kawaguchi, Amirata Ghorbani, James Zou
AAML · 09 Oct 2020

Learning Optimal Representations with the Decodable Information Bottleneck
Yann Dubois, Douwe Kiela, D. Schwab, Ramakrishna Vedantam
27 Sep 2020

How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks
Keyulu Xu, Mozhi Zhang, Jingling Li, S. Du, Ken-ichi Kawarabayashi, Stefanie Jegelka
MLT · 24 Sep 2020

GraphNorm: A Principled Approach to Accelerating Graph Neural Network Training
Tianle Cai, Shengjie Luo, Keyulu Xu, Di He, Tie-Yan Liu, Liwei Wang
GNN · 07 Sep 2020

Hybrid Differentially Private Federated Learning on Vertically Partitioned Data
Chang Wang, Jian Liang, Mingkai Huang, Bing Bai, Kun Bai, Hao Li
FedML · 06 Sep 2020

Stochastic Hamiltonian Gradient Methods for Smooth Games
Nicolas Loizou, Hugo Berard, Alexia Jolicoeur-Martineau, Pascal Vincent, Simon Lacoste-Julien, Ioannis Mitliagkas
08 Jul 2020

Training (Overparametrized) Neural Networks in Near-Linear Time
Jan van den Brand, Binghui Peng, Zhao Song, Omri Weinstein
ODL · 20 Jun 2020

SGD for Structured Nonconvex Functions: Learning Rates, Minibatching and Interpolation
Robert Mansel Gower, Othmane Sebbouh, Nicolas Loizou
18 Jun 2020

Federated Accelerated Stochastic Gradient Descent
Honglin Yuan, Tengyu Ma
FedML · 16 Jun 2020

Shape Matters: Understanding the Implicit Bias of the Noise Covariance
Jeff Z. HaoChen, Colin Wei, Jason D. Lee, Tengyu Ma
15 Jun 2020

Fine-Grained Analysis of Stability and Generalization for Stochastic Gradient Descent
Yunwen Lei, Yiming Ying
MLT · 15 Jun 2020

Stability of Stochastic Gradient Descent on Nonsmooth Convex Losses
Raef Bassily, Vitaly Feldman, Cristóbal Guzmán, Kunal Talwar
MLT · 12 Jun 2020

Speedy Performance Estimation for Neural Architecture Search
Binxin Ru, Clare Lyle, Lisa Schut, M. Fil, Mark van der Wilk, Y. Gal
08 Jun 2020

Bayesian Neural Network via Stochastic Gradient Descent
Abhinav Sagar
UQCV, BDL · 04 Jun 2020

Private Stochastic Convex Optimization: Optimal Rates in Linear Time
Vitaly Feldman, Tomer Koren, Kunal Talwar
10 May 2020

Detached Error Feedback for Distributed SGD with Random Sparsification
An Xu, Heng-Chiao Huang
11 Apr 2020

Forgetting Outside the Box: Scrubbing Deep Networks of Information Accessible from Input-Output Observations
Aditya Golatkar, Alessandro Achille, Stefano Soatto
MU, OOD · 05 Mar 2020

Understanding Self-Training for Gradual Domain Adaptation
Ananya Kumar, Tengyu Ma, Percy Liang
CLL, TTA · 26 Feb 2020

Coherent Gradients: An Approach to Understanding Generalization in Gradient Descent-based Optimization
S. Chatterjee
ODL, OOD · 25 Feb 2020

Stochastic Polyak Step-size for SGD: An Adaptive Learning Rate for Fast Convergence
Nicolas Loizou, Sharan Vaswani, I. Laradji, Simon Lacoste-Julien
24 Feb 2020

Data Heterogeneity Differential Privacy: From Theory to Algorithm
Yilin Kang, Jian Li, Yong Liu, Weiping Wang
20 Feb 2020

Performative Prediction
Juan C. Perdomo, Tijana Zrnic, Celestine Mendler-Dünner, Moritz Hardt
16 Feb 2020

Statistical Learning with Conditional Value at Risk
Tasuku Soma, Yuichi Yoshida
14 Feb 2020

A Diffusion Theory For Deep Learning Dynamics: Stochastic Gradient Descent Exponentially Favors Flat Minima
Zeke Xie, Issei Sato, Masashi Sugiyama
ODL · 10 Feb 2020

On the distance between two neural networks and the stability of learning
Jeremy Bernstein, Arash Vahdat, Yisong Yue, Xuan Li
ODL · 09 Feb 2020

Reasoning About Generalization via Conditional Mutual Information
Thomas Steinke, Lydia Zakynthinou
24 Jan 2020

Large-scale Kernel Methods and Applications to Lifelong Robot Learning
Raffaello Camoriano
11 Dec 2019

Distributionally Robust Neural Networks for Group Shifts: On the Importance of Regularization for Worst-Case Generalization
Shiori Sagawa, Pang Wei Koh, Tatsunori B. Hashimoto, Percy Liang
OOD · 20 Nov 2019

Layer-Dependent Importance Sampling for Training Deep and Large Graph Convolutional Networks
Difan Zou, Ziniu Hu, Yewen Wang, Song Jiang, Yizhou Sun, Quanquan Gu
GNN · 17 Nov 2019

Information-Theoretic Generalization Bounds for SGLD via Data-Dependent Estimates
Jeffrey Negrea, Mahdi Haghifam, Gintare Karolina Dziugaite, Ashish Khisti, Daniel M. Roy
FedML · 06 Nov 2019

Sharper bounds for uniformly stable algorithms
Olivier Bousquet, Yegor Klochkov, Nikita Zhivotovskiy
17 Oct 2019

The Implicit Regularization of Ordinary Least Squares Ensembles
Daniel LeJeune, Hamid Javadi, Richard G. Baraniuk
10 Oct 2019

Improved Sample Complexities for Deep Networks and Robust Classification via an All-Layer Margin
Colin Wei, Tengyu Ma
AAML, OOD · 09 Oct 2019

Partial differential equation regularization for supervised machine learning
Jillian R. Fisher
03 Oct 2019

On-line Non-Convex Constrained Optimization
Olivier Massicot, Jakub Mareček
16 Sep 2019

Private Stochastic Convex Optimization with Optimal Rates
Raef Bassily, Vitaly Feldman, Kunal Talwar, Abhradeep Thakurta
27 Aug 2019

Mix and Match: An Optimistic Tree-Search Approach for Learning Models from Mixture Distributions
Matthew Faw, Rajat Sen, Karthikeyan Shanmugam, C. Caramanis, Sanjay Shakkottai
23 Jul 2019

Generalization Guarantees for Neural Networks via Harnessing the Low-rank Structure of the Jacobian
Samet Oymak, Zalan Fabian, Mingchen Li, Mahdi Soltanolkotabi
MLT · 12 Jun 2019

Does Learning Require Memorization? A Short Tale about a Long Tail
Vitaly Feldman
TDI · 12 Jun 2019

Importance Resampling for Off-policy Prediction
M. Schlegel, Wesley Chung, Daniel Graves, Jian Qian, Martha White
OffRL · 11 Jun 2019

Implicit Regularization of Accelerated Methods in Hilbert Spaces
Nicolò Pagliana, Lorenzo Rosasco
30 May 2019

Gram-Gauss-Newton Method: Learning Overparameterized Neural Networks for Regression Problems
Tianle Cai, Ruiqi Gao, Jikai Hou, Siyu Chen, Dong Wang, Di He, Zhihua Zhang, Liwei Wang
ODL · 28 May 2019

SGD on Neural Networks Learns Functions of Increasing Complexity
Preetum Nakkiran, Gal Kaplun, Dimitris Kalimeris, Tristan Yang, Benjamin L. Edelman, Fred Zhang, Boaz Barak
MLT · 28 May 2019

Physics-informed Autoencoders for Lyapunov-stable Fluid Flow Prediction
N. Benjamin Erichson, Michael Muehlebach, Michael W. Mahoney
AI4CE, PINN · 26 May 2019

Explicitizing an Implicit Bias of the Frequency Principle in Two-layer Neural Networks
Tao Luo, Zhi-Qin John Xu, Yaoyu Zhang, Zheng Ma
MLT, AI4CE · 24 May 2019

Orthogonal Deep Neural Networks
Kui Jia, Shuai Li, Yuxin Wen, Tongliang Liu, Dacheng Tao
15 May 2019

Data-dependent Sample Complexity of Deep Neural Networks via Lipschitz Augmentation
Colin Wei, Tengyu Ma
09 May 2019

Stochastic Iterative Hard Thresholding for Graph-structured Sparsity Optimization
Baojian Zhou, F. Chen, Yiming Ying
09 May 2019