Communication-Efficient Distributed Deep Learning: A Comprehensive Survey
arXiv:2003.06307 · 10 March 2020
Zhenheng Tang, Shaoshuai Shi, Wei Wang, Yue Liu, Xiaowen Chu
Papers citing "Communication-Efficient Distributed Deep Learning: A Comprehensive Survey" (35 of 35 shown):
- Asynchronous SGD Beats Minibatch SGD Under Arbitrary Delays · Konstantin Mishchenko, Francis R. Bach, Mathieu Even, Blake E. Woodworth · 15 Jun 2022
- Virtual Homogeneity Learning: Defending against Data Heterogeneity in Federated Learning · Zhenheng Tang, Yonggang Zhang, Shaoshuai Shi, Xinfu He, Bo Han, Xiaowen Chu · [FedML] · 06 Jun 2022
- Accelerating Distributed K-FAC with Smart Parallelism of Computing and Communication Tasks · Shaoshuai Shi, Lin Zhang, Yue Liu · 14 Jul 2021
- Communication-efficient SGD: From Local SGD to One-Shot Averaging · Artin Spiridonoff, Alexander Olshevsky, I. Paschalidis · [FedML] · 09 Jun 2021
- Adaptive Quantization of Model Updates for Communication-Efficient Federated Learning · Divyansh Jhunjhunwala, Advait Gadhikar, Gauri Joshi, Yonina C. Eldar · [FedML, MQ] · 08 Feb 2021
- A Survey of Deep Learning Techniques for Neural Machine Translation · Shu Yang, Yuxin Wang, Xiaowen Chu · [VLM, AI4TS, AI4CE] · 18 Feb 2020
- Blink: Fast and Generic Collectives for Distributed ML · Guanhua Wang, Shivaram Venkataraman, Amar Phanishayee, J. Thelin, Nikhil R. Devanur, Ion Stoica · [VLM] · 11 Oct 2019
- Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism · Mohammad Shoeybi, M. Patwary, Raul Puri, P. LeGresley, Jared Casper, Bryan Catanzaro · [MoE] · 17 Sep 2019
- Priority-based Parameter Propagation for Distributed DNN Training · Anand Jayarajan, Jinliang Wei, Garth A. Gibson, Alexandra Fedorova, Gennady Pekhimenko · [AI4CE] · 10 May 2019
- Large Batch Optimization for Deep Learning: Training BERT in 76 minutes · Yang You, Jing Li, Sashank J. Reddi, Jonathan Hseu, Sanjiv Kumar, Srinadh Bhojanapalli, Xiaodan Song, J. Demmel, Kurt Keutzer, Cho-Jui Hsieh · [ODL] · 01 Apr 2019
- Robust and Communication-Efficient Federated Learning from Non-IID Data · Felix Sattler, Simon Wiedemann, K. Müller, Wojciech Samek · [FedML] · 07 Mar 2019
- Distributed Learning with Sparse Communications by Identification · Dmitry Grishchenko, F. Iutzeler, J. Malick, Massih-Reza Amini · 10 Dec 2018
- Stochastic Gradient Push for Distributed Deep Learning · Mahmoud Assran, Nicolas Loizou, Nicolas Ballas, Michael G. Rabbat · 27 Nov 2018
- Measuring the Effects of Data Parallelism on Neural Network Training · Christopher J. Shallue, Jaehoon Lee, J. Antognini, Jascha Sohl-Dickstein, Roy Frostig, George E. Dahl · 08 Nov 2018
- A Dual Approach for Optimal Algorithms in Distributed Optimization over Networks · César A. Uribe, Soomin Lee, Alexander Gasnikov, A. Nedić · 03 Sep 2018
- Cooperative SGD: A Unified Framework for the Design and Analysis of Communication-Efficient SGD Algorithms · Jianyu Wang, Gauri Joshi · 22 Aug 2018
- LAG: Lazily Aggregated Gradient for Communication-Efficient Distributed Learning · Tianyi Chen, G. Giannakis, Tao Sun, W. Yin · 25 May 2018
- Local SGD Converges Fast and Communicates Little · Sebastian U. Stich · [FedML] · 24 May 2018
- GossipGraD: Scalable Deep Learning using Gossip Communication based Asynchronous Gradient Descent · J. Daily, Abhinav Vishnu, Charles Siegel, T. Warfel, Vinay C. Amatya · 15 Mar 2018
- TicTac: Accelerating Distributed Deep Learning with Communication Scheduling · Sayed Hadi Hashemi, Sangeetha Abdu Jyothi, R. Campbell · 08 Mar 2018
- AdaComp: Adaptive Residual Gradient Compression for Data-Parallel Distributed Training · Chia-Yu Chen, Jungwook Choi, D. Brand, A. Agrawal, Wei Zhang, K. Gopalakrishnan · [ODL] · 07 Dec 2017
- Deep Gradient Compression: Reducing the Communication Bandwidth for Distributed Training · Yujun Lin, Song Han, Huizi Mao, Yu Wang, W. Dally · 05 Dec 2017
- The TensorFlow Partitioning and Scheduling Problem: It's the Critical Path! · R. Mayer, C. Mayer, Larissa Laich · 06 Nov 2017
- Revisiting Unreasonable Effectiveness of Data in Deep Learning Era · Chen Sun, Abhinav Shrivastava, Saurabh Singh, Abhinav Gupta · [VLM] · 10 Jul 2017
- Device Placement Optimization with Reinforcement Learning · Azalia Mirhoseini, Hieu H. Pham, Quoc V. Le, Benoit Steiner, Rasmus Larsen, Yuefeng Zhou, Naveen Kumar, Mohammad Norouzi, Samy Bengio, J. Dean · 13 Jun 2017
- Poseidon: An Efficient Communication Architecture for Distributed Deep Learning on GPU Clusters · Hao Zhang, Zeyu Zheng, Shizhen Xu, Wei-Ming Dai, Qirong Ho, Xiaodan Liang, Zhiting Hu, Jinliang Wei, P. Xie, Eric Xing · [GNN] · 11 Jun 2017
- Efficient Processing of Deep Neural Networks: A Tutorial and Survey · Vivienne Sze, Yu-hsin Chen, Tien-Ju Yang, J. Emer · [AAML, 3DV] · 27 Mar 2017
- Federated Learning: Strategies for Improving Communication Efficiency · Jakub Konecný, H. B. McMahan, Felix X. Yu, Peter Richtárik, A. Suresh, Dave Bacon · [FedML] · 18 Oct 2016
- On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima · N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang · [ODL] · 15 Sep 2016
- Layer Normalization · Jimmy Lei Ba, J. Kiros, Geoffrey E. Hinton · 21 Jul 2016
- TensorFlow: A system for large-scale machine learning · Martín Abadi, P. Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, ..., Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, Xiaoqiang Zheng · [GNN, AI4CE] · 27 May 2016
- Revisiting Distributed Synchronous SGD · Jianmin Chen, Xinghao Pan, R. Monga, Samy Bengio, Rafal Jozefowicz · 04 Apr 2016
- A distributed block coordinate descent method for training l_1 regularized linear classifiers · D. Mahajan, S. Keerthi, S. Sundararajan · 18 May 2014
- One weird trick for parallelizing convolutional neural networks · A. Krizhevsky · [GNN] · 23 Apr 2014
- Parallel Coordinate Descent for L1-Regularized Loss Minimization · Joseph K. Bradley, Aapo Kyrola, Danny Bickson, Carlos Guestrin · 26 May 2011