Train faster, generalize better: Stability of stochastic gradient descent

3 September 2015 · Moritz Hardt, Benjamin Recht, Y. Singer
ArXiv (abs) · PDF · HTML

Papers citing "Train faster, generalize better: Stability of stochastic gradient descent"

50 / 679 papers shown
Train simultaneously, generalize better: Stability of gradient-based minimax learners
  Farzan Farnia, Asuman Ozdaglar · 73 / 48 / 0 · 23 Oct 2020
Feature Selection for Huge Data via Minipatch Learning
  Tianyi Yao, Genevera I. Allen · 38 / 10 / 0 · 16 Oct 2020
The Deep Bootstrap Framework: Good Online Learners are Good Offline Generalizers
  Preetum Nakkiran, Behnam Neyshabur, Hanie Sedghi · OffRL · 104 / 11 / 0 · 16 Oct 2020
Deep generative demixing: Recovering Lipschitz signals from noisy subgaussian mixtures
  Aaron Berk · 43 / 0 / 0 · 13 Oct 2020
Explaining Neural Matrix Factorization with Gradient Rollback
  Carolin (Haas) Lawrence, T. Sztyler, Mathias Niepert · 102 / 12 / 0 · 12 Oct 2020
How Does Mixup Help With Robustness and Generalization?
  Linjun Zhang, Zhun Deng, Kenji Kawaguchi, Amirata Ghorbani, James Zou · AAML · 110 / 252 / 0 · 09 Oct 2020
Learning Binary Decision Trees by Argmin Differentiation
  Valentina Zantedeschi, Matt J. Kusner, Vlad Niculae · 64 / 13 / 0 · 09 Oct 2020
Kernel regression in high dimensions: Refined analysis beyond double descent
  Fanghui Liu, Zhenyu Liao, Johan A. K. Suykens · 86 / 50 / 0 · 06 Oct 2020
Learning Optimal Representations with the Decodable Information Bottleneck
  Yann Dubois, Douwe Kiela, D. Schwab, Ramakrishna Vedantam · 122 / 43 / 0 · 27 Sep 2020
Faster Biological Gradient Descent Learning
  H. Li · ODL · 22 / 1 / 0 · 27 Sep 2020
How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks
  Keyulu Xu, Mozhi Zhang, Jingling Li, S. Du, Ken-ichi Kawarabayashi, Stefanie Jegelka · MLT · 184 / 313 / 0 · 24 Sep 2020
Implicit Gradient Regularization
  David Barrett, Benoit Dherin · 104 / 152 / 0 · 23 Sep 2020
Hybrid Stochastic-Deterministic Minibatch Proximal Gradient: Less-Than-Single-Pass Optimization with Nearly Optimal Generalization
  Pan Zhou, Xiaotong Yuan · 50 / 6 / 0 · 18 Sep 2020
GraphNorm: A Principled Approach to Accelerating Graph Neural Network Training
  Tianle Cai, Shengjie Luo, Keyulu Xu, Di He, Tie-Yan Liu, Liwei Wang · GNN · 108 / 167 / 0 · 07 Sep 2020
Hybrid Differentially Private Federated Learning on Vertically Partitioned Data
  Chang Wang, Jian Liang, Mingkai Huang, Bing Bai, Kun Bai, Hao Li · FedML · 122 / 39 / 0 · 06 Sep 2020
Making Coherence Out of Nothing At All: Measuring the Evolution of Gradient Alignment
  S. Chatterjee, Piotr Zielinski · 54 / 8 / 0 · 03 Aug 2020
Principles and Algorithms for Forecasting Groups of Time Series: Locality and Globality
  Pablo Montero-Manso, Rob J. Hyndman · AI4TS · 102 / 139 / 0 · 02 Aug 2020
Cross-validation Confidence Intervals for Test Error
  Pierre Bayle, Alexandre Bayle, Lucas Janson, Lester W. Mackey · 73 / 40 / 0 · 24 Jul 2020
Tighter Generalization Bounds for Iterative Differentially Private Learning Algorithms
  Fengxiang He, Bohan Wang, Dacheng Tao · FedML · 55 / 18 / 0 · 18 Jul 2020
Measurement error models: from nonparametric methods to deep neural networks
  Zhirui Hu, Z. Ke, Jun S. Liu · 31 / 4 / 0 · 15 Jul 2020
Stochastic Hamiltonian Gradient Methods for Smooth Games
  Nicolas Loizou, Hugo Berard, Alexia Jolicoeur-Martineau, Pascal Vincent, Simon Lacoste-Julien, Ioannis Mitliagkas · 69 / 50 / 0 · 08 Jul 2020
Meta-Learning with Network Pruning
  Hongduan Tian, Bo Liu, Xiaotong Yuan, Qingshan Liu · 61 / 27 / 0 · 07 Jul 2020
AdaSGD: Bridging the gap between SGD and Adam
  Jiaxuan Wang, Jenna Wiens · 77 / 10 / 0 · 30 Jun 2020
Adaptive Inertia: Disentangling the Effects of Adaptive Learning Rate and Momentum
  Zeke Xie, Xinrui Wang, Huishuai Zhang, Issei Sato, Masashi Sugiyama · ODL · 155 / 48 / 0 · 29 Jun 2020
Lipschitzness Is All You Need To Tame Off-policy Generative Adversarial Imitation Learning
  Lionel Blondé, Pablo Strasser, Alexandros Kalousis · 90 / 22 / 0 · 28 Jun 2020
Stability Enhanced Privacy and Applications in Private Stochastic Gradient Descent
  Lauren Watson, Benedek Rozemberczki, Rik Sarkar · 21 / 1 / 0 · 25 Jun 2020
Accelerated Large Batch Optimization of BERT Pretraining in 54 minutes
  Shuai Zheng, Yanghua Peng, Sheng Zha, Mu Li · ODL · 72 / 21 / 0 · 24 Jun 2020
ByGARS: Byzantine SGD with Arbitrary Number of Attackers
  Jayanth Reddy Regatti, Hao Chen, Abhishek Gupta · FedML · AAML · 70 / 4 / 0 · 24 Jun 2020
Understanding Deep Architectures with Reasoning Layer
  Xinshi Chen, Yufei Zhang, C. Reisinger, Le Song · AI4CE · 127 / 7 / 0 · 24 Jun 2020
Training (Overparametrized) Neural Networks in Near-Linear Time
  Jan van den Brand, Binghui Peng, Zhao Song, Omri Weinstein · ODL · 91 / 83 / 0 · 20 Jun 2020
Stochastic Gradient Descent in Hilbert Scales: Smoothness, Preconditioning and Earlier Stopping
  Nicole Mücke, Enrico Reiss · 47 / 7 / 0 · 18 Jun 2020
SGD for Structured Nonconvex Functions: Learning Rates, Minibatching and Interpolation
  Robert Mansel Gower, Othmane Sebbouh, Nicolas Loizou · 135 / 76 / 0 · 18 Jun 2020
Federated Accelerated Stochastic Gradient Descent
  Honglin Yuan, Tengyu Ma · FedML · 104 / 180 / 0 · 16 Jun 2020
Shape Matters: Understanding the Implicit Bias of the Noise Covariance
  Jeff Z. HaoChen, Colin Wei, Jason D. Lee, Tengyu Ma · 219 / 95 / 0 · 15 Jun 2020
Fine-Grained Analysis of Stability and Generalization for Stochastic Gradient Descent
  Yunwen Lei, Yiming Ying · MLT · 99 / 129 / 0 · 15 Jun 2020
Stability of Stochastic Gradient Descent on Nonsmooth Convex Losses
  Raef Bassily, Vitaly Feldman, Cristóbal Guzmán, Kunal Talwar · MLT · 83 / 198 / 0 · 12 Jun 2020
Revisiting Explicit Regularization in Neural Networks for Well-Calibrated Predictive Uncertainty
  Taejong Joo, U. Chung · BDL · UQCV · 34 / 0 / 0 · 11 Jun 2020
Speedy Performance Estimation for Neural Architecture Search
  Binxin Ru, Clare Lyle, Lisa Schut, M. Fil, Mark van der Wilk, Y. Gal · 107 / 37 / 0 · 08 Jun 2020
Bayesian Neural Network via Stochastic Gradient Descent
  Abhinav Sagar · UQCV · BDL · 62 / 2 / 0 · 04 Jun 2020
Instability, Computational Efficiency and Statistical Accuracy
  Nhat Ho, K. Khamaru, Raaz Dwivedi, Martin J. Wainwright, Michael I. Jordan, Bin Yu · 72 / 20 / 0 · 22 May 2020
Stochastic Optimization with Heavy-Tailed Noise via Accelerated Gradient Clipping
  Eduard A. Gorbunov, Marina Danilova, Alexander Gasnikov · 79 / 123 / 0 · 21 May 2020
LALR: Theoretical and Experimental validation of Lipschitz Adaptive Learning Rate in Regression and Neural Networks
  Snehanshu Saha, Tejas Prashanth, Suraj Aralihalli, Sumedh Basarkod, T. Sudarshan, S. Dhavala · 35 / 4 / 0 · 19 May 2020
Scaling-up Distributed Processing of Data Streams for Machine Learning
  M. Nokleby, Haroon Raja, W. Bajwa · 69 / 15 / 0 · 18 May 2020
Private Stochastic Convex Optimization: Optimal Rates in Linear Time
  Vitaly Feldman, Tomer Koren, Kunal Talwar · 85 / 211 / 0 · 10 May 2020
Stochastic batch size for adaptive regularization in deep network optimization
  Kensuke Nakamura, Stefano Soatto, Byung-Woo Hong · ODL · 51 / 6 / 0 · 14 Apr 2020
Detached Error Feedback for Distributed SGD with Random Sparsification
  An Xu, Heng-Chiao Huang · 71 / 9 / 0 · 11 Apr 2020
R-FORCE: Robust Learning for Random Recurrent Neural Networks
  Yang Zheng, Eli Shlizerman · OOD · 41 / 5 / 0 · 25 Mar 2020
A termination criterion for stochastic gradient descent for binary classification
  Sina Baghal, Courtney Paquette, S. Vavasis · 39 / 0 / 0 · 23 Mar 2020
Weak and Strong Gradient Directions: Explaining Memorization, Generalization, and Hardness of Examples at Scale
  Piotr Zielinski, Shankar Krishnan, S. Chatterjee · ODL · 129 / 2 / 0 · 16 Mar 2020
Interference and Generalization in Temporal Difference Learning
  Emmanuel Bengio, Joelle Pineau, Doina Precup · 88 / 61 / 0 · 13 Mar 2020