1808.07217
Don't Use Large Mini-Batches, Use Local SGD
22 August 2018
Tao R. Lin
Sebastian U. Stich
Kumar Kshitij Patel
Martin Jaggi
Papers citing
"Don't Use Large Mini-Batches, Use Local SGD"
50 / 271 papers shown
Title
Sharp Gaussian approximations for Decentralized Federated Learning
Soham Bonnerjee
Sayar Karmakar
W. Wu
FedML
24
0
0
12 May 2025
Pseudo-Asynchronous Local SGD: Robust and Efficient Data-Parallel Training
Hiroki Naganuma
Xinzhi Zhang
Man-Chung Yue
Ioannis Mitliagkas
Philipp A. Witte
Russell J. Hewett
Yin Tat Lee
63
0
0
25 Apr 2025
Federated Learning for Medical Image Classification: A Comprehensive Benchmark
Zhekai Zhou
Guibo Luo
Mingzhi Chen
Zhenyu Weng
Yuesheng Zhu
FedML
26
0
0
07 Apr 2025
Convergence Analysis of Federated Learning Methods Using Backward Error Analysis
Jinwoo Lim
Suhyun Kim
Soo-Mook Moon
FedML
55
0
0
05 Mar 2025
Tackling Feature and Sample Heterogeneity in Decentralized Multi-Task Learning: A Sheaf-Theoretic Approach
Chaouki Ben Issaid
Praneeth Vepakomma
Mehdi Bennis
78
0
0
03 Feb 2025
FedSat: A Statistical Aggregation Approach for Class Imbalanced Clients in Federated Learning
S. Chowdhury
Raju Halder
FedML
35
0
0
31 Dec 2024
A Unified Analysis of Federated Learning with Arbitrary Client Participation
Shiqiang Wang
Mingyue Ji
FedML
37
55
0
31 Dec 2024
EDiT: A Local-SGD-Based Efficient Distributed Training Method for Large Language Models
Jialiang Cheng
Ning Gao
Yun Yue
Zhiling Ye
Jiadi Jiang
Jian Sha
OffRL
77
0
0
10 Dec 2024
Task Arithmetic Through The Lens Of One-Shot Federated Learning
Zhixu Tao
I. Mason
Sanjeev R. Kulkarni
Xavier Boix
MoMe
FedML
84
3
0
27 Nov 2024
Distributed Sign Momentum with Local Steps for Training Transformers
Shuhua Yu
Ding Zhou
Cong Xie
An Xu
Zhi-Li Zhang
Xin Liu
S. Kar
64
0
0
26 Nov 2024
Photon: Federated LLM Pre-Training
Lorenzo Sani
Alex Iacob
Zeyu Cao
Royson Lee
Bill Marino
...
Dongqi Cai
Zexi Li
Wanru Zhao
Xinchi Qiu
Nicholas D. Lane
AI4CE
26
7
0
05 Nov 2024
Enhancing Federated Learning Convergence with Dynamic Data Queue and Data Entropy-driven Participant Selection
Charuka Herath
Xiaolan Liu
S. Lambotharan
Y. Rahulamathavan
FedML
23
2
0
23 Oct 2024
SDP4Bit: Toward 4-bit Communication Quantization in Sharded Data Parallelism for LLM Training
Jinda Jia
Cong Xie
Hanlin Lu
Daoce Wang
Hao Feng
...
Baixi Sun
Haibin Lin
Zhi-Li Zhang
Xin Liu
Dingwen Tao
MQ
25
4
0
20 Oct 2024
DEPT: Decoupled Embeddings for Pre-training Language Models
Alex Iacob
Lorenzo Sani
Meghdad Kurmanji
William F. Shen
Xinchi Qiu
Dongqi Cai
Yan Gao
Nicholas D. Lane
VLM
139
0
0
07 Oct 2024
Can We Theoretically Quantify the Impacts of Local Updates on the Generalization Performance of Federated Learning?
Peizhong Ju
Haibo Yang
Jia Liu
Yingbin Liang
Ness B. Shroff
FedML
29
0
0
05 Sep 2024
FADAS: Towards Federated Adaptive Asynchronous Optimization
Yujia Wang
Shiqiang Wang
Songtao Lu
Jinghui Chen
FedML
34
3
0
25 Jul 2024
A New Theoretical Perspective on Data Heterogeneity in Federated Optimization
Jiayi Wang
Shiqiang Wang
Rong-Rong Chen
Mingyue Ji
FedML
28
1
0
22 Jul 2024
Personalized Multi-tier Federated Learning
Sourasekhar Banerjee
Ali Dadras
A. Yurtsever
Monowar Bhuyan
FedML
51
3
0
19 Jul 2024
On the Trade-off between Flatness and Optimization in Distributed Learning
Ying Cao
Zhaoxian Wu
Kun Yuan
Ali H. Sayed
36
1
0
28 Jun 2024
Communication-Efficient Adaptive Batch Size Strategies for Distributed Local Gradient Methods
Tim Tsz-Kit Lau
Weijian Li
Chenwei Xu
Han Liu
Mladen Kolar
41
1
0
20 Jun 2024
Batch-in-Batch: a new adversarial training framework for initial perturbation and sample selection
Yinting Wu
Pai Peng
Bo Cai
Le Li
AAML
33
0
0
06 Jun 2024
ACCO: Accumulate while you Communicate, Hiding Communications in Distributed LLM Training
Adel Nabli
Louis Fournier
Pierre Erbacher
Louis Serrano
Eugene Belilovsky
Edouard Oyallon
FedML
46
1
0
03 Jun 2024
Communication-Efficient Distributed Deep Learning via Federated Dynamic Averaging
Michail Theologitis
Georgios Frangias
Georgios Anestis
V. Samoladas
Antonios Deligiannakis
FedML
27
0
0
31 May 2024
Full-Stack Allreduce on Multi-Rail Networks
Enda Yu
Dezun Dong
Xiangke Liao
GNN
19
0
0
28 May 2024
WASH: Train your Ensemble with Communication-Efficient Weight Shuffling, then Average
Louis Fournier
Adel Nabli
Masih Aminbeidokhti
M. Pedersoli
Eugene Belilovsky
Edouard Oyallon
MoMe
FedML
41
3
0
27 May 2024
Client2Vec: Improving Federated Learning by Distribution Shifts Aware Client Indexing
Yongxin Guo
Lin Wang
Xiaoying Tang
Tao R. Lin
FedML
OOD
27
0
0
25 May 2024
Efficiency for Free: Ideal Data Are Transportable Representations
Peng Sun
Yi Jiang
Tao Lin
DD
41
0
0
23 May 2024
Worldwide Federated Training of Language Models
Alexandru Iacob
Lorenzo Sani
Bill Marino
Preslav Aleksandrov
William F. Shen
Nicholas D. Lane
FedML
35
2
0
23 May 2024
The Limits and Potentials of Local SGD for Distributed Heterogeneous Learning with Intermittent Communication
Kumar Kshitij Patel
Margalit Glasgow
Ali Zindari
Lingxiao Wang
Sebastian U. Stich
Ziheng Cheng
Nirmit Joshi
Nathan Srebro
44
6
0
19 May 2024
The Future of Large Language Model Pre-training is Federated
Lorenzo Sani
Alexandru Iacob
Zeyu Cao
Bill Marino
Yan Gao
...
Wanru Zhao
William F. Shen
Preslav Aleksandrov
Xinchi Qiu
Nicholas D. Lane
AI4CE
35
12
0
17 May 2024
AB-Training: A Communication-Efficient Approach for Distributed Low-Rank Learning
D. Coquelin
Katherina Flügel
Marie Weiel
Nicholas Kiefer
Muhammed Öz
Charlotte Debus
Achim Streit
Markus Goetz
37
0
0
02 May 2024
Improved Generalization Bounds for Communication Efficient Federated Learning
Peyman Gholami
H. Seferoglu
FedML
AI4CE
18
6
0
17 Apr 2024
Communication-Efficient Large-Scale Distributed Deep Learning: A Comprehensive Survey
Feng Liang
Zhen Zhang
Haifeng Lu
Victor C. M. Leung
Yanyi Guo
Xiping Hu
GNN
31
6
0
09 Apr 2024
AdaptSFL: Adaptive Split Federated Learning in Resource-constrained Edge Networks
Zhengyi Lin
Guanqiao Qu
Wei Wei
Xianhao Chen
Kin K. Leung
48
48
0
19 Mar 2024
On the Convergence of Federated Learning Algorithms without Data Similarity
Ali Beikmohammadi
Sarit Khirirat
Sindri Magnússon
FedML
33
1
0
29 Feb 2024
Training Neural Networks from Scratch with Parallel Low-Rank Adapters
Minyoung Huh
Brian Cheung
Jeremy Bernstein
Phillip Isola
Pulkit Agrawal
35
10
0
26 Feb 2024
CO2: Efficient Distributed Training with Full Communication-Computation Overlap
Weigao Sun
Zhen Qin
Weixuan Sun
Shidi Li
Dong Li
Xuyang Shen
Yu Qiao
Yiran Zhong
OffRL
53
10
0
29 Jan 2024
Momentum-SAM: Sharpness Aware Minimization without Computational Overhead
Marlon Becker
Frederick Altrock
Benjamin Risse
76
5
0
22 Jan 2024
Asynchronous Local-SGD Training for Language Modeling
Bo Liu
Rachita Chhaparia
Arthur Douillard
Satyen Kale
Andrei A. Rusu
Jiajun Shen
Arthur Szlam
Marc'Aurelio Ranzato
FedML
33
10
0
17 Jan 2024
On the Role of Server Momentum in Federated Learning
Jianhui Sun
Xidong Wu
Heng-Chiao Huang
Aidong Zhang
FedML
49
11
0
19 Dec 2023
Can We Learn Communication-Efficient Optimizers?
Charles-Étienne Joseph
Benjamin Thérien
A. Moudgil
Boris Knyazev
Eugene Belilovsky
26
1
0
02 Dec 2023
Communication-Efficient Heterogeneous Federated Learning with Generalized Heavy-Ball Momentum
Riccardo Zaccone
Carlo Masone
Marco Ciccone
FedML
24
2
0
30 Nov 2023
DiLoCo: Distributed Low-Communication Training of Language Models
Arthur Douillard
Qixuan Feng
Andrei A. Rusu
Rachita Chhaparia
Yani Donchev
A. Kuncoro
Marc'Aurelio Ranzato
Arthur Szlam
Jiajun Shen
56
31
0
14 Nov 2023
A Quadratic Synchronization Rule for Distributed Deep Learning
Xinran Gu
Kaifeng Lyu
Sanjeev Arora
Jingzhao Zhang
Longbo Huang
51
1
0
22 Oct 2023
Federated Multi-Objective Learning
Haibo Yang
Zhuqing Liu
Jia-Wei Liu
Chaosheng Dong
Michinari Momma
FedML
23
7
0
15 Oct 2023
Revisiting Decentralized ProxSkip: Achieving Linear Speedup
Luyao Guo
Sulaiman A. Alghunaim
Kun Yuan
Laurent Condat
Jinde Cao
FedML
36
1
0
12 Oct 2023
Stability and Generalization for Minibatch SGD and Local SGD
Yunwen Lei
Tao Sun
Mingrui Liu
29
3
0
02 Oct 2023
FedLALR: Client-Specific Adaptive Learning Rates Achieve Linear Speedup for Non-IID Data
Hao Sun
Li Shen
Shi-Yong Chen
Jingwei Sun
Jing Li
Guangzhong Sun
Dacheng Tao
FedML
31
1
0
18 Sep 2023
Stochastic Gradient Descent-like relaxation is equivalent to Metropolis dynamics in discrete optimization and inference problems
Maria Chiara Angelini
A. Cavaliere
Raffaele Marino
F. Ricci-Tersenghi
53
5
0
11 Sep 2023
Federated Learning Over Images: Vertical Decompositions and Pre-Trained Backbones Are Difficult to Beat
Erdong Hu
Yu-Shuen Tang
Anastasios Kyrillidis
C. Jermaine
FedML
28
10
0
06 Sep 2023