Communication-Efficient Distributed Deep Learning: A Comprehensive Survey
arXiv:2003.06307 · 10 March 2020
Zhenheng Tang, Shaoshuai Shi, Wei Wang, Yue Liu, Xiaowen Chu
Papers citing "Communication-Efficient Distributed Deep Learning: A Comprehensive Survey" (35 of 35 shown):
- Asynchronous SGD Beats Minibatch SGD Under Arbitrary Delays · Konstantin Mishchenko, Francis R. Bach, Mathieu Even, Blake E. Woodworth · 15 Jun 2022
- Virtual Homogeneity Learning: Defending against Data Heterogeneity in Federated Learning · Zhenheng Tang, Yonggang Zhang, Shaoshuai Shi, Xinfu He, Bo Han, Xiaowen Chu · [FedML] · 06 Jun 2022
- Accelerating Distributed K-FAC with Smart Parallelism of Computing and Communication Tasks · Shaoshuai Shi, Lin Zhang, Yue Liu · 14 Jul 2021
- Communication-efficient SGD: From Local SGD to One-Shot Averaging · Artin Spiridonoff, Alexander Olshevsky, I. Paschalidis · [FedML] · 09 Jun 2021
- Adaptive Quantization of Model Updates for Communication-Efficient Federated Learning · Divyansh Jhunjhunwala, Advait Gadhikar, Gauri Joshi, Yonina C. Eldar · [FedML, MQ] · 08 Feb 2021
- A Survey of Deep Learning Techniques for Neural Machine Translation · Shu Yang, Yuxin Wang, Xiaowen Chu · [VLM, AI4TS, AI4CE] · 18 Feb 2020
- Blink: Fast and Generic Collectives for Distributed ML · Guanhua Wang, Shivaram Venkataraman, Amar Phanishayee, J. Thelin, Nikhil R. Devanur, Ion Stoica · [VLM] · 11 Oct 2019
- Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism · Mohammad Shoeybi, M. Patwary, Raul Puri, P. LeGresley, Jared Casper, Bryan Catanzaro · [MoE] · 17 Sep 2019
- Priority-based Parameter Propagation for Distributed DNN Training · Anand Jayarajan, Jinliang Wei, Garth A. Gibson, Alexandra Fedorova, Gennady Pekhimenko · [AI4CE] · 10 May 2019
- Large Batch Optimization for Deep Learning: Training BERT in 76 minutes · Yang You, Jing Li, Sashank J. Reddi, Jonathan Hseu, Sanjiv Kumar, Srinadh Bhojanapalli, Xiaodan Song, J. Demmel, Kurt Keutzer, Cho-Jui Hsieh · [ODL] · 01 Apr 2019
- Robust and Communication-Efficient Federated Learning from Non-IID Data · Felix Sattler, Simon Wiedemann, K. Müller, Wojciech Samek · [FedML] · 07 Mar 2019
- Distributed Learning with Sparse Communications by Identification · Dmitry Grishchenko, F. Iutzeler, J. Malick, Massih-Reza Amini · 10 Dec 2018
- Stochastic Gradient Push for Distributed Deep Learning · Mahmoud Assran, Nicolas Loizou, Nicolas Ballas, Michael G. Rabbat · 27 Nov 2018
- Measuring the Effects of Data Parallelism on Neural Network Training · Christopher J. Shallue, Jaehoon Lee, J. Antognini, Jascha Sohl-Dickstein, Roy Frostig, George E. Dahl · 08 Nov 2018
- A Dual Approach for Optimal Algorithms in Distributed Optimization over Networks · César A. Uribe, Soomin Lee, Alexander Gasnikov, A. Nedić · 03 Sep 2018
- Cooperative SGD: A Unified Framework for the Design and Analysis of Communication-Efficient SGD Algorithms · Jianyu Wang, Gauri Joshi · 22 Aug 2018
- LAG: Lazily Aggregated Gradient for Communication-Efficient Distributed Learning · Tianyi Chen, G. Giannakis, Tao Sun, W. Yin · 25 May 2018
- Local SGD Converges Fast and Communicates Little · Sebastian U. Stich · [FedML] · 24 May 2018
- GossipGraD: Scalable Deep Learning using Gossip Communication based Asynchronous Gradient Descent · J. Daily, Abhinav Vishnu, Charles Siegel, T. Warfel, Vinay C. Amatya · 15 Mar 2018
- TicTac: Accelerating Distributed Deep Learning with Communication Scheduling · Sayed Hadi Hashemi, Sangeetha Abdu Jyothi, R. Campbell · 08 Mar 2018
- AdaComp: Adaptive Residual Gradient Compression for Data-Parallel Distributed Training · Chia-Yu Chen, Jungwook Choi, D. Brand, A. Agrawal, Wei Zhang, K. Gopalakrishnan · [ODL] · 07 Dec 2017
- Deep Gradient Compression: Reducing the Communication Bandwidth for Distributed Training · Yujun Lin, Song Han, Huizi Mao, Yu Wang, W. Dally · 05 Dec 2017
- The TensorFlow Partitioning and Scheduling Problem: It's the Critical Path! · R. Mayer, C. Mayer, Larissa Laich · 06 Nov 2017
- Revisiting Unreasonable Effectiveness of Data in Deep Learning Era · Chen Sun, Abhinav Shrivastava, Saurabh Singh, Abhinav Gupta · [VLM] · 10 Jul 2017
- Device Placement Optimization with Reinforcement Learning · Azalia Mirhoseini, Hieu H. Pham, Quoc V. Le, Benoit Steiner, Rasmus Larsen, Yuefeng Zhou, Naveen Kumar, Mohammad Norouzi, Samy Bengio, J. Dean · 13 Jun 2017
- Poseidon: An Efficient Communication Architecture for Distributed Deep Learning on GPU Clusters · Hao Zhang, Zeyu Zheng, Shizhen Xu, Wei-Ming Dai, Qirong Ho, Xiaodan Liang, Zhiting Hu, Jinliang Wei, P. Xie, Eric Xing · [GNN] · 11 Jun 2017
- Efficient Processing of Deep Neural Networks: A Tutorial and Survey · Vivienne Sze, Yu-hsin Chen, Tien-Ju Yang, J. Emer · [AAML, 3DV] · 27 Mar 2017
- Federated Learning: Strategies for Improving Communication Efficiency · Jakub Konecný, H. B. McMahan, Felix X. Yu, Peter Richtárik, A. Suresh, Dave Bacon · [FedML] · 18 Oct 2016
- On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima · N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang · [ODL] · 15 Sep 2016
- Layer Normalization · Jimmy Lei Ba, J. Kiros, Geoffrey E. Hinton · 21 Jul 2016
- TensorFlow: A system for large-scale machine learning · Martín Abadi, P. Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, ..., Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, Xiaoqiang Zheng · [GNN, AI4CE] · 27 May 2016
- Revisiting Distributed Synchronous SGD · Jianmin Chen, Xinghao Pan, R. Monga, Samy Bengio, Rafal Jozefowicz · 04 Apr 2016
- A distributed block coordinate descent method for training l_1 regularized linear classifiers · D. Mahajan, S. Keerthi, S. Sundararajan · 18 May 2014
- One weird trick for parallelizing convolutional neural networks · A. Krizhevsky · [GNN] · 23 Apr 2014
- Parallel Coordinate Descent for L1-Regularized Loss Minimization · Joseph K. Bradley, Aapo Kyrola, Danny Bickson, Carlos Guestrin · 26 May 2011