ResearchTrend.AI
Train faster, generalize better: Stability of stochastic gradient descent

3 September 2015
Moritz Hardt
Benjamin Recht
Y. Singer

Papers citing "Train faster, generalize better: Stability of stochastic gradient descent"

50 / 679 papers shown
Flexora: Flexible Low Rank Adaptation for Large Language Models
Chenxing Wei
Yao Shu
Y. He
Fei Richard Yu
AI4CE
98
4
0
20 Aug 2024
Learning Decisions Offline from Censored Observations with ε-insensitive Operational Costs
Minxia Chen
Ke Fu
Teng Huang
Miao Bai
OffRL
99
0
0
14 Aug 2024
Reciprocal Learning
Julian Rodemann
Christoph Jansen
G. Schollmeyer
FedML
78
0
0
12 Aug 2024
Dual-Channel Latent Factor Analysis Enhanced Graph Contrastive Learning for Recommendation
Junfeng Long
Hao Wu
66
0
0
09 Aug 2024
Distribution Learning for Molecular Regression
Nima Shoghi
Pooya Shoghi
Anuroop Sriram
Abhishek Das
OOD
130
0
0
30 Jul 2024
Generalization bounds for regression and classification on adaptive covering input domains
Wen-Liang Hwang
69
0
0
29 Jul 2024
Private Heterogeneous Federated Learning Without a Trusted Server Revisited: Error-Optimal and Communication-Efficient Algorithms for Convex Losses
Changyu Gao
Andrew Lowy
Xingyu Zhou
Stephen J. Wright
FedML
85
5
0
12 Jul 2024
Generalization Error Matters in Decentralized Learning Under Byzantine Attacks
Haoxiang Ye
Qing Ling
71
1
0
11 Jul 2024
Curvature Clues: Decoding Deep Learning Privacy with Input Loss Curvature
Deepak Ravikumar
Efstathia Soufleri
Kaushik Roy
77
0
0
03 Jul 2024
Information Guided Regularization for Fine-tuning Language Models
Mandar Sharma
Nikhil Muralidhar
Shengzhe Xu
Raquib Bin Yousuf
Naren Ramakrishnan
113
0
0
20 Jun 2024
How Does Distribution Matching Help Domain Generalization: An Information-theoretic Analysis
Yuxin Dong
Tieliang Gong
Hong Chen
Shuangyong Song
Weizhan Zhang
Chen Li
OOD
90
1
0
14 Jun 2024
Loss Gradient Gaussian Width based Generalization and Optimization Guarantees
A. Banerjee
Qiaobo Li
Yingxue Zhou
168
0
0
11 Jun 2024
Stable Minima Cannot Overfit in Univariate ReLU Networks: Generalization by Large Step Sizes
Dan Qiao
Kaiqi Zhang
Esha Singh
Daniel Soudry
Yu-Xiang Wang
NoLa
91
4
0
10 Jun 2024
Stochastic Polyak Step-sizes and Momentum: Convergence Guarantees and Practical Performance
Dimitris Oikonomou
Nicolas Loizou
97
6
0
06 Jun 2024
Private Stochastic Convex Optimization with Heavy Tails: Near-Optimality from Simple Reductions
Hilal Asi
Daogao Liu
Kevin Tian
76
4
0
04 Jun 2024
Generalization Bound and New Algorithm for Clean-Label Backdoor Attack
Lijia Yu
Shuang Liu
Yibo Miao
Xiao-Shan Gao
Lijun Zhang
AAML
96
7
0
02 Jun 2024
A Novel Review of Stability Techniques for Improved Privacy-Preserving Machine Learning
Coleman DuPlessie
Aidan Gao
60
0
0
31 May 2024
RoPINN: Region Optimized Physics-Informed Neural Networks
Haixu Wu
Huakun Luo
Yuezhou Ma
Jianmin Wang
Mingsheng Long
AI4CE
75
9
0
23 May 2024
Maximum Entropy Reinforcement Learning via Energy-Based Normalizing Flow
Chen-Hao Chao
Chien Feng
Wei-Fang Sun
Cheng-Kuang Lee
Simon See
Chun-Yi Lee
88
5
0
22 May 2024
A General Theory for Compositional Generalization
Jingwen Fu
Zhizheng Zhang
Yan Lu
Nanning Zheng
AI4CE, CoGe
70
2
0
20 May 2024
Stochastic Gradient MCMC for Massive Geostatistical Data
M. Abba
Brian J. Reich
Reetam Majumder
Brandon Feng
57
1
0
07 May 2024
Uniformly Stable Algorithms for Adversarial Training and Beyond
Jiancong Xiao
Jiawei Zhang
Zhimin Luo
Asuman Ozdaglar
AAML
72
2
0
03 May 2024
The Sample Complexity of Gradient Descent in Stochastic Convex Optimization
Roi Livni
MLT
83
1
0
07 Apr 2024
Statistical Mechanics and Artificial Neural Networks: Principles, Models, and Applications
Lucas Böttcher
Gregory R. Wheeler
86
0
0
05 Apr 2024
Reduced Jeffries-Matusita distance: A Novel Loss Function to Improve Generalization Performance of Deep Classification Models
M. Lashkari
Amin Gheibi
65
0
0
13 Mar 2024
Unveiling Privacy, Memorization, and Input Curvature Links
Deepak Ravikumar
Efstathia Soufleri
Abolfazl Hashemi
Kaushik Roy
107
6
0
28 Feb 2024
From Inverse Optimization to Feasibility to ERM
Saurabh Mishra
Anant Raj
Sharan Vaswani
85
3
0
27 Feb 2024
Federated Fairness without Access to Sensitive Groups
Afroditi Papadaki
Natalia Martínez
Martín Bertrán
Guillermo Sapiro
Miguel R. D. Rodrigues
FedML
79
2
0
22 Feb 2024
Investigating the Histogram Loss in Regression
Ehsan Imani
Kai Luedemann
Sam Scholnick-Hughes
Esraa Elelimy
Martha White
UQCV
67
6
0
20 Feb 2024
LoRA Training in the NTK Regime has No Spurious Local Minima
Uijeong Jang
Jason D. Lee
Ernest K. Ryu
123
17
0
19 Feb 2024
AdaBatchGrad: Combining Adaptive Batch Size and Adaptive Step Size
P. Ostroukhov
Aigerim Zhumabayeva
Chulu Xiang
Alexander Gasnikov
Martin Takáč
Dmitry Kamzolov
ODL
86
2
0
07 Feb 2024
Strong convexity-guided hyper-parameter optimization for flatter losses
Rahul Yedida
Snehanshu Saha
111
0
0
07 Feb 2024
Subsampling is not Magic: Why Large Batch Sizes Work for Differentially Private Stochastic Optimisation
Ossi Raisa
Hibiki Ito
Antti Honkela
77
6
0
06 Feb 2024
Langevin Unlearning: A New Perspective of Noisy Gradient Descent for Machine Unlearning
Eli Chien
Haoyu Wang
Ziang Chen
Pan Li
MU
135
17
0
18 Jan 2024
Leveraging Gradients for Unsupervised Accuracy Estimation under Distribution Shift
Renchunzi Xie
Ambroise Odonnat
Vasilii Feofanov
I. Redko
Jianfeng Zhang
Bo An
UQCV
152
1
0
17 Jan 2024
Stabilizing Sharpness-aware Minimization Through A Simple Renormalization Strategy
Chengli Tan
Jiangshe Zhang
Junmin Liu
Yicheng Wang
Yunda Hao
AAML
91
2
0
14 Jan 2024
Convex SGD: Generalization Without Early Stopping
Julien Hendrickx
A. Olshevsky
MLT, LRM
67
1
0
08 Jan 2024
Data-Dependent Stability Analysis of Adversarial Training
Yihan Wang
Shuang Liu
Xiao-Shan Gao
68
4
0
06 Jan 2024
Class-wise Generalization Error: an Information-Theoretic Analysis
Firas Laakom
Yuheng Bu
Moncef Gabbouj
78
1
0
05 Jan 2024
StableKD: Breaking Inter-block Optimization Entanglement for Stable Knowledge Distillation
Shiu-hong Kao
Jierun Chen
S.-H. Gary Chan
75
0
0
20 Dec 2023
Density Descent for Diversity Optimization
David H. Lee
Anishalakshmi V. Palaparthi
Matthew C. Fontaine
Bryon Tjanaka
Stefanos Nikolaidis
71
1
0
18 Dec 2023
SoK: Unintended Interactions among Machine Learning Defenses and Risks
Vasisht Duddu
S. Szyller
Nadarajah Asokan
AAML
169
2
0
07 Dec 2023
Optimal Sample Complexity of Contrastive Learning
Noga Alon
Dmitrii Avdiukhin
Dor Elboim
Orr Fischer
G. Yaroslavtsev
SSL
78
7
0
01 Dec 2023
Adam-like Algorithm with Smooth Clipping Attains Global Minima: Analysis Based on Ergodicity of Functional SDEs
Keisuke Suzuki
60
0
0
29 Nov 2023
Nonparametric Teaching for Multiple Learners
Chen Zhang
Xiaofeng Cao
Weiyang Liu
Ivor Tsang
James T. Kwok
68
4
0
17 Nov 2023
Using Stochastic Gradient Descent to Smooth Nonconvex Functions: Analysis of Implicit Graduated Optimization with Optimal Noise Scheduling
Naoki Sato
Hideaki Iiduka
84
3
0
15 Nov 2023
Signal Processing Meets SGD: From Momentum to Filter
Zhipeng Yao
Guisong Chang
Jiaqi Zhang
Qi Zhang
Dazhou Li
Yu Zhang
ODL
109
0
0
06 Nov 2023
NeuroEvoBench: Benchmarking Evolutionary Optimizers for Deep Learning Applications
Robert Tjarko Lange
Yujin Tang
Yingtao Tian
ELM
89
3
0
04 Nov 2023
Generalization Bounds for Label Noise Stochastic Gradient Descent
Jung Eun Huh
Patrick Rebeschini
71
1
0
01 Nov 2023
Initialization Matters: Privacy-Utility Analysis of Overparameterized Neural Networks
Jiayuan Ye
Zhenyu Zhu
Fanghui Liu
Reza Shokri
Volkan Cevher
91
13
0
31 Oct 2023