ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2303.01215
  4. Cited By
Why (and When) does Local SGD Generalize Better than SGD?
v1v2 (latest)

Why (and When) does Local SGD Generalize Better than SGD?

2 March 2023
Xinran Gu
Kaifeng Lyu
Longbo Huang
Sanjeev Arora
ArXiv (abs)PDFHTML

Papers citing "Why (and When) does Local SGD Generalize Better than SGD?"

5 / 5 papers shown
Title
Pseudo-Asynchronous Local SGD: Robust and Efficient Data-Parallel Training
Pseudo-Asynchronous Local SGD: Robust and Efficient Data-Parallel Training
Hiroki Naganuma
Xinzhi Zhang
Man-Chung Yue
Ioannis Mitliagkas
Philipp A. Witte
Russell J. Hewett
Yin Tat Lee
256
0
0
25 Apr 2025
Revisiting LocalSGD and SCAFFOLD: Improved Rates and Missing Analysis
Revisiting LocalSGD and SCAFFOLD: Improved Rates and Missing Analysis
Ruichen Luo
Sebastian U Stich
Samuel Horváth
Martin Takáč
139
0
0
08 Jan 2025
EDiT: A Local-SGD-Based Efficient Distributed Training Method for Large Language Models
EDiT: A Local-SGD-Based Efficient Distributed Training Method for Large Language Models
Jialiang Cheng
Ning Gao
Yun Yue
Zhiling Ye
Jiadi Jiang
Jian Sha
OffRL
158
1
0
10 Dec 2024
The Marginal Value of Momentum for Small Learning Rate SGD
The Marginal Value of Momentum for Small Learning Rate SGD
Runzhe Wang
Sadhika Malladi
Tianhao Wang
Kaifeng Lyu
Zhiyuan Li
ODL
82
9
0
27 Jul 2023
Decentralized SGD and Average-direction SAM are Asymptotically
  Equivalent
Decentralized SGD and Average-direction SAM are Asymptotically Equivalent
Tongtian Zhu
Fengxiang He
Kaixuan Chen
Mingli Song
Dacheng Tao
154
15
0
05 Jun 2023
1