arXiv: 1606.07365
Parallel SGD: When does averaging help?
23 June 2016
Jian Zhang
Christopher De Sa
Ioannis Mitliagkas
Christopher Ré
Tags: MoMe, FedML
Papers citing "Parallel SGD: When does averaging help?" (50 of 67 shown)
- EDiT: A Local-SGD-Based Efficient Distributed Training Method for Large Language Models
  Jialiang Cheng, Ning Gao, Yun Yue, Zhiling Ye, Jiadi Jiang, Jian Sha. Tags: OffRL. 0 citations. 10 Dec 2024.
- FedAQ: Communication-Efficient Federated Edge Learning via Joint Uplink and Downlink Adaptive Quantization
  Linping Qu, Shenghui Song, Chi-Ying Tsui. Tags: MQ, FedML. 4 citations. 26 Jun 2024.
- Communication-Efficient Distributed Deep Learning via Federated Dynamic Averaging
  Michail Theologitis, Georgios Frangias, Georgios Anestis, V. Samoladas, Antonios Deligiannakis. Tags: FedML. 0 citations. 31 May 2024.
- The Limits and Potentials of Local SGD for Distributed Heterogeneous Learning with Intermittent Communication
  Kumar Kshitij Patel, Margalit Glasgow, Ali Zindari, Lingxiao Wang, Sebastian U. Stich, Ziheng Cheng, Nirmit Joshi, Nathan Srebro. 6 citations. 19 May 2024.
- Training Neural Networks from Scratch with Parallel Low-Rank Adapters
  Minyoung Huh, Brian Cheung, Jeremy Bernstein, Phillip Isola, Pulkit Agrawal. 10 citations. 26 Feb 2024.
- CO2: Efficient Distributed Training with Full Communication-Computation Overlap
  Weigao Sun, Zhen Qin, Weixuan Sun, Shidi Li, Dong Li, Xuyang Shen, Yu Qiao, Yiran Zhong. Tags: OffRL. 10 citations. 29 Jan 2024.
- Asynchronous Local-SGD Training for Language Modeling
  Bo Liu, Rachita Chhaparia, Arthur Douillard, Satyen Kale, Andrei A. Rusu, Jiajun Shen, Arthur Szlam, Marc'Aurelio Ranzato. Tags: FedML. 10 citations. 17 Jan 2024.
- Mini-batch Gradient Descent with Buffer
  Haobo Qi, Du Huang, Yingqiu Zhu, Danyang Huang, Hansheng Wang. 1 citation. 14 Dec 2023.
- Federated Learning Over Images: Vertical Decompositions and Pre-Trained Backbones Are Difficult to Beat
  Erdong Hu, Yu-Shuen Tang, Anastasios Kyrillidis, C. Jermaine. Tags: FedML. 10 citations. 06 Sep 2023.
- FedDec: Peer-to-peer Aided Federated Learning
  Marina Costantini, Giovanni Neglia, T. Spyropoulos. Tags: FedML. 1 citation. 11 Jun 2023.
- Understanding and Improving Model Averaging in Federated Learning on Heterogeneous Data
  Tailin Zhou, Zehong Lin, Jinchao Zhang, Danny H. K. Tsang. Tags: MoMe, FedML. 12 citations. 13 May 2023.
- Hierarchical Weight Averaging for Deep Neural Networks
  Xiaozhe Gu, Zixun Zhang, Yuncheng Jiang, Tao Luo, Ruimao Zhang, Shuguang Cui, Zhuguo Li. 5 citations. 23 Apr 2023.
- Accelerating Hybrid Federated Learning Convergence under Partial Participation
  Jieming Bian, Lei Wang, Kun Yang, Cong Shen, Jie Xu. Tags: FedML. 11 citations. 10 Apr 2023.
- ABS: Adaptive Bounded Staleness Converges Faster and Communicates Less
  Qiao Tan, Feng Zhu, Jingjing Zhang. 0 citations. 21 Jan 2023.
- On the Performance of Gradient Tracking with Local Updates
  Edward Duc Hien Nguyen, Sulaiman A. Alghunaim, Kun Yuan, César A. Uribe. 19 citations. 10 Oct 2022.
- Parallel and Streaming Wavelet Neural Networks for Classification and Regression under Apache Spark
  E Venkatesh, Yelleti Vivek, V. Ravi, Shiva Shankar Orsu. 6 citations. 07 Sep 2022.
- Distributed Evolution Strategies for Black-box Stochastic Optimization
  Xiaoyu He, Zibin Zheng, Chuan Chen, Yuren Zhou, Chuan Luo, Qingwei Lin. 4 citations. 09 Apr 2022.
- Scaling the Wild: Decentralizing Hogwild!-style Shared-memory SGD
  Bapi Chatterjee, Vyacheslav Kungurtsev, Dan Alistarh. Tags: FedML. 2 citations. 13 Mar 2022.
- ProxSkip: Yes! Local Gradient Steps Provably Lead to Communication Acceleration! Finally!
  Konstantin Mishchenko, Grigory Malinovsky, Sebastian U. Stich, Peter Richtárik. 149 citations. 18 Feb 2022.
- On the Convergence of Shallow Neural Network Training with Randomly Masked Neurons
  Fangshuo Liao, Anastasios Kyrillidis. 16 citations. 05 Dec 2021.
- Large-Scale Deep Learning Optimizations: A Comprehensive Survey
  Xiaoxin He, Fuzhao Xue, Xiaozhe Ren, Yang You. 14 citations. 01 Nov 2021.
- Trade-offs of Local SGD at Scale: An Empirical Study
  Jose Javier Gonzalez Ortiz, Jonathan Frankle, Michael G. Rabbat, Ari S. Morcos, Nicolas Ballas. Tags: FedML. 19 citations. 15 Oct 2021.
- Local SGD Optimizes Overparameterized Neural Networks in Polynomial Time
  Yuyang Deng, Mohammad Mahdi Kamani, M. Mahdavi. Tags: FedML. 14 citations. 22 Jul 2021.
- ResIST: Layer-Wise Decomposition of ResNets for Distributed Training
  Chen Dun, Cameron R. Wolfe, C. Jermaine, Anastasios Kyrillidis. 21 citations. 02 Jul 2021.
- Communication-efficient SGD: From Local SGD to One-Shot Averaging
  Artin Spiridonoff, Alexander Olshevsky, I. Paschalidis. Tags: FedML. 20 citations. 09 Jun 2021.
- Accelerating Gossip SGD with Periodic Global Averaging
  Yiming Chen, Kun Yuan, Yingya Zhang, Pan Pan, Yinghui Xu, W. Yin. 41 citations. 19 May 2021.
- CrossoverScheduler: Overlapping Multiple Distributed Training Applications in a Crossover Manner
  Cheng Luo, L. Qu, Youshan Miao, Peng Cheng, Y. Xiong. 0 citations. 14 Mar 2021.
- FedDR -- Randomized Douglas-Rachford Splitting Algorithms for Nonconvex Federated Composite Optimization
  Quoc Tran-Dinh, Nhan H. Pham, Dzung Phan, Lam M. Nguyen. Tags: FedML. 39 citations. 05 Mar 2021.
- Oscars: Adaptive Semi-Synchronous Parallel Model for Distributed Deep Learning with Global View
  Sheng-Jun Huang. 0 citations. 17 Feb 2021.
- Truly Sparse Neural Networks at Scale
  Selima Curci, D. Mocanu, Mykola Pechenizkiy. 19 citations. 02 Feb 2021.
- FedSKETCH: Communication-Efficient and Private Federated Learning via Sketching
  Farzin Haddadpour, Belhal Karimi, Ping Li, Xiaoyun Li. Tags: FedML. 31 citations. 11 Aug 2020.
- Multi-Level Local SGD for Heterogeneous Hierarchical Networks
  Timothy Castiglia, Anirban Das, S. Patterson. 13 citations. 27 Jul 2020.
- DBS: Dynamic Batch Size For Distributed Deep Neural Network Training
  Qing Ye, Yuhao Zhou, Mingjia Shi, Yanan Sun, Jiancheng Lv. 11 citations. 23 Jul 2020.
- Adaptive Periodic Averaging: A Practical Approach to Reducing Communication in Distributed Learning
  Peng Jiang, G. Agrawal. 5 citations. 13 Jul 2020.
- Attack of the Tails: Yes, You Really Can Backdoor Federated Learning
  Hongyi Wang, Kartik K. Sreenivasan, Shashank Rajput, Harit Vishwakarma, Saurabh Agarwal, Jy-yong Sohn, Kangwook Lee, Dimitris Papailiopoulos. Tags: FedML. 589 citations. 09 Jul 2020.
- Federated Learning with Compression: Unified Analysis and Sharp Guarantees
  Farzin Haddadpour, Mohammad Mahdi Kamani, Aryan Mokhtari, M. Mahdavi. Tags: FedML. 271 citations. 02 Jul 2020.
- STL-SGD: Speeding Up Local SGD with Stagewise Communication Period
  Shuheng Shen, Yifei Cheng, Jingchang Liu, Linli Xu. Tags: LRM. 7 citations. 11 Jun 2020.
- Minibatch vs Local SGD for Heterogeneous Distributed Learning
  Blake E. Woodworth, Kumar Kshitij Patel, Nathan Srebro. Tags: FedML. 198 citations. 08 Jun 2020.
- Local SGD With a Communication Overhead Depending Only on the Number of Workers
  Artin Spiridonoff, Alexander Olshevsky, I. Paschalidis. Tags: FedML. 19 citations. 03 Jun 2020.
- Weighted Aggregating Stochastic Gradient Descent for Parallel Deep Learning
  Pengzhan Guo, Zeyang Ye, Keli Xiao, Wei Zhu. 14 citations. 07 Apr 2020.
- Differentially Private Federated Learning for Resource-Constrained Internet of Things
  Rui Hu, Yuanxiong Guo, E. Ratazzi, Yanmin Gong. Tags: FedML. 17 citations. 28 Mar 2020.
- A Hybrid-Order Distributed SGD Method for Non-Convex Optimization to Balance Communication Overhead, Computational Complexity, and Convergence Rate
  Naeimeh Omidvar, M. Maddah-ali, Hamed Mahdavi. Tags: ODL. 3 citations. 27 Mar 2020.
- Communication-Efficient Distributed Deep Learning: A Comprehensive Survey
  Zhenheng Tang, S. Shi, Wei Wang, Bo-wen Li, Xiaowen Chu. 48 citations. 10 Mar 2020.
- Communication optimization strategies for distributed deep neural network training: A survey
  Shuo Ouyang, Dezun Dong, Yemao Xu, Liquan Xiao. 12 citations. 06 Mar 2020.
- Is Local SGD Better than Minibatch SGD?
  Blake E. Woodworth, Kumar Kshitij Patel, Sebastian U. Stich, Zhen Dai, Brian Bullins, H. B. McMahan, Ohad Shamir, Nathan Srebro. Tags: FedML. 253 citations. 18 Feb 2020.
- Stochastic Weight Averaging in Parallel: Large-Batch Training that Generalizes Well
  Vipul Gupta, S. Serrano, D. DeCoste. Tags: MoMe. 55 citations. 07 Jan 2020.
- Parallel Restarted SPIDER -- Communication Efficient Distributed Nonconvex Optimization with Optimal Computation Complexity
  Pranay Sharma, Swatantra Kafle, Prashant Khanduri, Saikiran Bulusu, K. Rajawat, P. Varshney. Tags: FedML. 17 citations. 12 Dec 2019.
- On the Convergence of Local Descent Methods in Federated Learning
  Farzin Haddadpour, M. Mahdavi. Tags: FedML. 266 citations. 31 Oct 2019.
- Local SGD with Periodic Averaging: Tighter Analysis and Adaptive Synchronization
  Farzin Haddadpour, Mohammad Mahdi Kamani, M. Mahdavi, V. Cadambe. Tags: FedML. 199 citations. 30 Oct 2019.
- Sparsification as a Remedy for Staleness in Distributed Asynchronous SGD
  Rosa Candela, Giulio Franzese, Maurizio Filippone, Pietro Michiardi. 1 citation. 21 Oct 2019.