arXiv:1509.01240
Train faster, generalize better: Stability of stochastic gradient descent

3 September 2015
Moritz Hardt
Benjamin Recht
Y. Singer

Papers citing "Train faster, generalize better: Stability of stochastic gradient descent"

50 / 679 papers shown
AlphaDecay: Module-wise Weight Decay for Heavy-Tailed Balancing in LLMs
Di He
Ajay Jaiswal
Songjun Tu
Li Shen
Ganzhao Yuan
Shiwei Liu
L. Yin
47
0
0
17 Jun 2025
Generalization Bound of Gradient Flow through Training Trajectory and Data-dependent Kernel
Yilan Chen
Zhichao Wang
Wei Huang
Andi Han
Taiji Suzuki
Arya Mazumdar
MLT
30
0
0
12 Jun 2025
Generalization Error Analysis for Attack-Free and Byzantine-Resilient Decentralized Learning with Data Heterogeneity
Haoxiang Ye
Tao Sun
Qing Ling
FedML
61
0
0
11 Jun 2025
An Adaptive Method Stabilizing Activations for Enhanced Generalization
Hyunseok Seung
Jaewoo Lee
Hyunsuk Ko
ODL
32
0
0
10 Jun 2025
Optimal Rates in Continual Linear Regression via Increasing Regularization
Ran Levinstein
Amit Attia
Matan Schliserman
Uri Sherman
Tomer Koren
Daniel Soudry
Itay Evron
CLL
34
0
0
06 Jun 2025
Privacy Amplification in Differentially Private Zeroth-Order Optimization with Hidden States
Eli Chien
Wei-Ning Chen
P. Li
33
0
0
30 May 2025
Towards Understanding The Calibration Benefits of Sharpness-Aware Minimization
C. Tan
Yubo Zhou
Haishan Ye
Guang Dai
Junmin Liu
Zengjie Song
Jiangshe Zhang
Zixiang Zhao
Yunda Hao
Yong Xu
AAML
44
0
0
29 May 2025
Faithful Group Shapley Value
Kiljae Lee
Ziqi Liu
Weijing Tang
Yuan Zhang
TDI, FedML
147
0
0
25 May 2025
Temperature is All You Need for Generalization in Langevin Dynamics and other Markov Processes
I. Harel
Yonathan Wolanowsky
Gal Vardi
Nathan Srebro
Daniel Soudry
AI4CE
86
0
0
25 May 2025
Approach to Finding a Robust Deep Learning Model
Alexey Boldyrev
Fedor Ratnikov
Andrey Shevelev
OOD
200
0
0
22 May 2025
Convergence of Adam in Deep ReLU Networks via Directional Complexity and Kakeya Bounds
Anupama Sridhar
Alexander Johansen
80
0
0
21 May 2025
Online Learning and Unlearning
Yaxi Hu
Bernhard Schölkopf
Amartya Sanyal
MU, OnRL
102
0
0
13 May 2025
Rapid Overfitting of Multi-Pass Stochastic Gradient Descent in Stochastic Convex Optimization
Shira Vansover-Hager
Tomer Koren
Roi Livni
75
0
0
13 May 2025
Stability Regularized Cross-Validation
Ryan Cory-Wright
A. Gómez
52
0
0
11 May 2025
Gradient Descent as a Shrinkage Operator for Spectral Bias
Simon Lucey
87
0
0
25 Apr 2025
NeuralGrok: Accelerate Grokking by Neural Gradient Transformation
Xinyu Zhou
Simin Fan
Martin Jaggi
Jie Fu
71
0
0
24 Apr 2025
Leave-One-Out Stable Conformal Prediction
Kiljae Lee
Yuan Zhang
234
0
0
16 Apr 2025
Randomized Pairwise Learning with Adaptive Sampling: A PAC-Bayes Analysis
Sijia Zhou
Yunwen Lei
Ata Kabán
152
0
0
03 Apr 2025
Generalizability of Neural Networks Minimizing Empirical Risk Based on Expressive Ability
Lijia Yu
Yibo Miao
Yifan Zhu
Xiao-Shan Gao
Lijun Zhang
92
0
0
06 Mar 2025
Sharpness-Aware Minimization: General Analysis and Improved Rates
Dimitris Oikonomou
Nicolas Loizou
96
1
0
04 Mar 2025
Stability-based Generalization Analysis of Randomized Coordinate Descent for Pairwise Learning
Liang Wu
Ruixi Hu
Yunwen Lei
77
0
0
03 Mar 2025
A Theoretical Perspective: How to Prevent Model Collapse in Self-consuming Training Loops
Shi Fu
Yingjie Wang
Yuzhu Chen
Xinmei Tian
Dacheng Tao
93
3
0
26 Feb 2025
SASSHA: Sharpness-aware Adaptive Second-order Optimization with Stable Hessian Approximation
Dahun Shin
Dongyeop Lee
Jinseok Chung
Namhoon Lee
ODL, AAML
517
0
0
25 Feb 2025
Towards Auto-Regressive Next-Token Prediction: In-Context Learning Emerges from Generalization
Zixuan Gong
Xiaolin Hu
Huayi Tang
Yong Liu
152
0
0
24 Feb 2025
Learning Variational Inequalities from Data: Fast Generalization Rates under Strong Monotonicity
Eric Zhao
Tatjana Chavdarova
Michael I. Jordan
119
0
0
20 Feb 2025
Stability-based Generalization Bounds for Variational Inference
Yadi Wei
Roni Khardon
BDL
85
0
0
17 Feb 2025
Understanding the Generalization Error of Markov algorithms through Poissonization
Benjamin Dupuis
Maxime Haddouche
George Deligiannidis
Umut Simsekli
101
0
0
11 Feb 2025
Propagation of Chaos for Mean-Field Langevin Dynamics and its Application to Model Ensemble
Atsushi Nitanda
Anzelle Lee
Damian Tan Xing Kai
Mizuki Sakaguchi
Taiji Suzuki
AI4CE
112
1
0
09 Feb 2025
Stability and Generalization of Quantum Neural Networks
Jiaqi Yang
Wei Xie
Xiaohua Xu
95
1
0
22 Jan 2025
Stability and Generalization in Free Adversarial Training
Xiwei Cheng
Kexin Fu
Farzan Farnia
AAML
86
3
0
08 Jan 2025
Preserving Deep Representations In One-Shot Pruning: A Hessian-Free Second-Order Optimization Framework
Ryan Lucas
Rahul Mazumder
127
0
0
27 Nov 2024
Exploring the Generalization Capabilities of AID-based Bi-level Optimization
C. L. Philip Chen
Li Shen
Zhiqiang Xu
Wen Liu
Zhi-Quan Luo
Peilin Zhao
127
1
0
25 Nov 2024
Understanding Generalization of Federated Learning: the Trade-off between Model Stability and Optimization
Dun Zeng
Zheshun Wu
Shiyu Liu
Yu Pan
Xiaoying Tang
Zenglin Xu
MLT, FedML
168
1
0
25 Nov 2024
Understanding Generalization in Quantum Machine Learning with Margins
Tak Hur
Daniel K. Park
AI4CE
68
1
0
11 Nov 2024
Generalizability of Memorization Neural Networks
Lijia Yu
Xiao-Shan Gao
Lijun Zhang
Yibo Miao
101
1
0
01 Nov 2024
Faster Algorithms for User-Level Private Stochastic Convex Optimization
Andrew Lowy
Daogao Liu
Hilal Asi
60
1
0
24 Oct 2024
Rethinking generalization of classifiers in separable classes scenarios and over-parameterized regimes
Julius Martinetz
C. Linse
Thomas Martinetz
89
0
0
22 Oct 2024
Nonlinear Stochastic Gradient Descent and Heavy-tailed Noise: A Unified Framework and High-probability Guarantees
Aleksandar Armacki
Shuhua Yu
Pranay Sharma
Gauri Joshi
Dragana Bajović
D. Jakovetić
S. Kar
119
2
0
17 Oct 2024
Sharper Guarantees for Learning Neural Network Classifiers with Gradient Methods
Hossein Taheri
Christos Thrampoulidis
Arya Mazumdar
MLT
126
0
0
13 Oct 2024
Stability and Sharper Risk Bounds with Convergence Rate $O(1/n^2)$
Bowei Zhu
Shaojie Li
Yong Liu
55
0
0
13 Oct 2024
Deeper Insights into Deep Graph Convolutional Networks: Stability and Generalization
Guangrui Yang
Ming Li
Han Feng
Xiaosheng Zhuang
GNN, OOD, BDL
73
2
0
11 Oct 2024
Boosting the Performance of Decentralized Federated Learning via Catalyst Acceleration
Qinglun Li
Miao Zhang
Yingqi Liu
Quanjun Yin
Li Shen
Xiaochun Cao
FedML
94
0
0
09 Oct 2024
OledFL: Unleashing the Potential of Decentralized Federated Learning via Opposite Lookahead Enhancement
Qinglun Li
Miao Zhang
Mengzhu Wang
Quanjun Yin
Li Shen
OODD, FedML
66
0
0
09 Oct 2024
How Much Can We Forget about Data Contamination?
Sebastian Bordt
Suraj Srinivas
Valentyn Boreiko
U. V. Luxburg
146
2
0
04 Oct 2024
A-FedPD: Aligning Dual-Drift is All Federated Primal-Dual Learning Needs
Yan Sun
Li Shen
Dacheng Tao
FedML
94
0
0
27 Sep 2024
N-gram Prediction and Word Difference Representations for Language Modeling
DongNyeong Heo
Daniela N. Rim
Heeyoul Choi
112
2
0
05 Sep 2024
Onboard Satellite Image Classification for Earth Observation: A Comparative Study of ViT Models
Thanh-Dung Le
Vu Nguyen Ha
T. Nguyen
G. Eappen
P. Thiruvasagam
...
J. L. González-Rios
Luis M. Garces-Socarras
Juan Carlos Merlano-Duncan
Symeon Chatzinotas
79
5
0
05 Sep 2024
Bootstrap SGD: Algorithmic Stability and Robustness
Andreas Christmann
Yunwen Lei
55
0
0
02 Sep 2024
Enabling Humanitarian Applications with Targeted Differential Privacy
Nitin Kohli
J. Blumenstock
68
0
0
24 Aug 2024
Deep Learning with CNNs: A Compact Holistic Tutorial with Focus on Supervised Regression (Preprint)
Yansel Gónzalez Tejeda
Helmut A. Mayer
SSL
33
0
0
22 Aug 2024