TernGrad: Ternary Gradients to Reduce Communication in Distributed Deep Learning
W. Wen, Cong Xu, Feng Yan, Chunpeng Wu, Yandan Wang, Yiran Chen, Hai Helen Li
22 May 2017 · arXiv:1705.07878
Papers citing "TernGrad: Ternary Gradients to Reduce Communication in Distributed Deep Learning" (50 of 467 papers shown)
Decentralized Stochastic Optimization and Gossip Algorithms with Compressed Communication
Anastasia Koloskova, Sebastian U. Stich, Martin Jaggi · FedML · 25 / 503 / 0 · 01 Feb 2019

Error Feedback Fixes SignSGD and other Gradient Compression Schemes
Sai Praneeth Karimireddy, Quentin Rebjock, Sebastian U. Stich, Martin Jaggi · 27 / 493 / 0 · 28 Jan 2019

99% of Distributed Optimization is a Waste of Time: The Issue and How to Fix it
Konstantin Mishchenko, Filip Hanzely, Peter Richtárik · 16 / 13 / 0 · 27 Jan 2019

PruneTrain: Fast Neural Network Training by Dynamic Sparse Model Reconfiguration
Sangkug Lym, Esha Choukse, Siavash Zangeneh, W. Wen, Sujay Sanghavi, M. Erez · CVBM · 12 / 88 / 0 · 26 Jan 2019

Distributed Learning with Compressed Gradient Differences
Konstantin Mishchenko, Eduard A. Gorbunov, Martin Takáč, Peter Richtárik · 21 / 198 / 0 · 26 Jan 2019

Trajectory Normalized Gradients for Distributed Optimization
Jianqiao Wangni, Ke Li, Jianbo Shi, Jitendra Malik · 27 / 2 / 0 · 24 Jan 2019

Backprop with Approximate Activations for Memory-efficient Network Training
Ayan Chakrabarti, Benjamin Moseley · 24 / 37 / 0 · 23 Jan 2019

Accelerated Training for CNN Distributed Deep Learning through Automatic Resource-Aware Layer Placement
Jay H. Park, Sunghwan Kim, Jinwon Lee, Myeongjae Jeon, S. Noh · 27 / 11 / 0 · 17 Jan 2019

A Distributed Synchronous SGD Algorithm with Global Top-k Sparsification for Low Bandwidth Networks
Shaoshuai Shi, Qiang-qiang Wang, Kaiyong Zhao, Zhenheng Tang, Yuxin Wang, Xiang Huang, Xuming Hu · 40 / 135 / 0 · 14 Jan 2019

Quantized Epoch-SGD for Communication-Efficient Distributed Learning
Shen-Yi Zhao, Hao Gao, Wu-Jun Li · FedML · 22 / 3 / 0 · 10 Jan 2019

Bandwidth Reduction using Importance Weighted Pruning on Ring AllReduce
Zehua Cheng, Zhenghua Xu · 22 / 8 / 0 · 06 Jan 2019

Machine Learning at the Wireless Edge: Distributed Stochastic Gradient Descent Over-the-Air
Mohammad Mohammadi Amiri, Deniz Gunduz · 30 / 53 / 0 · 03 Jan 2019

Per-Tensor Fixed-Point Quantization of the Back-Propagation Algorithm
Charbel Sakr, Naresh R Shanbhag · MQ · 25 / 43 / 0 · 31 Dec 2018

Stanza: Layer Separation for Distributed Training in Deep Learning
Xiaorui Wu, Hongao Xu, Bo Li, Y. Xiong · MoE · 28 / 9 / 0 · 27 Dec 2018

Learning Private Neural Language Modeling with Attentive Aggregation
Shaoxiong Ji, Shirui Pan, Guodong Long, Xue Li, Jing Jiang, Zi Huang · FedML, MoMe · 18 / 137 / 0 · 17 Dec 2018

Compressed Distributed Gradient Descent: Communication-Efficient Consensus over Networks
Xin Zhang, Jia Liu, Zhengyuan Zhu, Elizabeth S. Bentley · 10 / 27 / 0 · 10 Dec 2018

No Peek: A Survey of private distributed deep learning
Praneeth Vepakomma, Tristan Swedish, Ramesh Raskar, O. Gupta, Abhimanyu Dubey · SyDa, FedML · 30 / 100 / 0 · 08 Dec 2018

Wireless Network Intelligence at the Edge
Jihong Park, S. Samarakoon, M. Bennis, Mérouane Debbah · 23 / 518 / 0 · 07 Dec 2018

MG-WFBP: Efficient Data Communication for Distributed Synchronous SGD Algorithms
Shaoshuai Shi, Xiaowen Chu, Bo Li · FedML · 24 / 89 / 0 · 27 Nov 2018

Stochastic Gradient Push for Distributed Deep Learning
Mahmoud Assran, Nicolas Loizou, Nicolas Ballas, Michael G. Rabbat · 30 / 343 / 0 · 27 Nov 2018

SuperNeurons: FFT-based Gradient Sparsification in the Distributed Training of Deep Neural Networks
Linnan Wang, Wei Wu, Junyu Zhang, Hang Liu, G. Bosilca, Maurice Herlihy, Rodrigo Fonseca · GNN · 23 / 5 / 0 · 21 Nov 2018

Pipe-SGD: A Decentralized Pipelined SGD Framework for Distributed Deep Net Training
Youjie Li, Hang Qiu, Songze Li, A. Avestimehr, Nam Sung Kim, Alex Schwing · FedML · 24 / 104 / 0 · 08 Nov 2018

GradiVeQ: Vector Quantization for Bandwidth-Efficient Gradient Aggregation in Distributed CNN Training
Timo C. Wunderlich, Zhifeng Lin, S. A. Aamir, Andreas Grübl, Youjie Li, David Stöckel, Alex Schwing, M. Annavaram, A. Avestimehr · MQ · 19 / 64 / 0 · 08 Nov 2018

A Hitchhiker's Guide On Distributed Training of Deep Neural Networks
K. Chahal, Manraj Singh Grover, Kuntal Dey · 3DH, OOD · 6 / 53 / 0 · 28 Oct 2018

Distributed Learning over Unreliable Networks
Chen Yu, Hanlin Tang, Cédric Renggli, S. Kassing, Ankit Singla, Dan Alistarh, Ce Zhang, Ji Liu · OOD · 25 / 60 / 0 · 17 Oct 2018

signSGD with Majority Vote is Communication Efficient And Fault Tolerant
Jeremy Bernstein, Jiawei Zhao, Kamyar Azizzadenesheli, Anima Anandkumar · FedML · 31 / 46 / 0 · 11 Oct 2018

Dynamic Sparse Graph for Efficient Deep Learning
L. Liu, Lei Deng, Xing Hu, Maohua Zhu, Guoqi Li, Yufei Ding, Yuan Xie · GNN · 40 / 42 / 0 · 01 Oct 2018

The Convergence of Sparsified Gradient Methods
Dan Alistarh, Torsten Hoefler, M. Johansson, Sarit Khirirat, Nikola Konstantinov, Cédric Renggli · 30 / 489 / 0 · 27 Sep 2018

Sparsified SGD with Memory
Sebastian U. Stich, Jean-Baptiste Cordonnier, Martin Jaggi · 41 / 740 / 0 · 20 Sep 2018

Cooperative SGD: A unified Framework for the Design and Analysis of Communication-Efficient SGD Algorithms
Jianyu Wang, Gauri Joshi · 33 / 348 / 0 · 22 Aug 2018

Don't Use Large Mini-Batches, Use Local SGD
Tao R. Lin, Sebastian U. Stich, Kumar Kshitij Patel, Martin Jaggi · 57 / 429 / 0 · 22 Aug 2018

RedSync: Reducing Synchronization Traffic for Distributed Deep Learning
Jiarui Fang, Haohuan Fu, Guangwen Yang, Cho-Jui Hsieh · GNN · 22 / 25 / 0 · 13 Aug 2018

A Survey on Methods and Theories of Quantized Neural Networks
Yunhui Guo · MQ · 34 / 232 / 0 · 13 Aug 2018

DFTerNet: Towards 2-bit Dynamic Fusion Networks for Accurate Human Activity Recognition
Zhan Yang, Osolo Ian Raymond, Chengyuan Zhang, Ying Wan, J. Long · CVBM · 47 / 36 / 0 · 31 Jul 2018

Parallel Restarted SGD with Faster Convergence and Less Communication: Demystifying Why Model Averaging Works for Deep Learning
Hao Yu, Sen Yang, Shenghuo Zhu · MoMe, FedML · 38 / 597 / 0 · 17 Jul 2018

Error Compensated Quantized SGD and its Applications to Large-scale Distributed Optimization
Jiaxiang Wu, Weidong Huang, Junzhou Huang, Tong Zhang · 24 / 235 / 0 · 21 Jun 2018

Distributed learning with compressed gradients
Sarit Khirirat, Hamid Reza Feyzmahdavian, M. Johansson · 33 / 83 / 0 · 18 Jun 2018

ATOMO: Communication-efficient Learning via Atomic Sparsification
Hongyi Wang, Scott Sievert, Zachary B. Charles, Shengchao Liu, S. Wright, Dimitris Papailiopoulos · 22 / 351 / 0 · 11 Jun 2018

The Effect of Network Width on the Performance of Large-batch Training
Lingjiao Chen, Hongyi Wang, Jinman Zhao, Dimitris Papailiopoulos, Paraschos Koutris · 29 / 22 / 0 · 11 Jun 2018

Structurally Sparsified Backward Propagation for Faster Long Short-Term Memory Training
Maohua Zhu, Jason Clemons, Jeff Pool, Minsoo Rhu, S. Keckler, Yuan Xie · 21 / 13 / 0 · 01 Jun 2018

On Consensus-Optimality Trade-offs in Collaborative Deep Learning
Zhanhong Jiang, Aditya Balu, Chinmay Hegde, Soumik Sarkar · FedML · 33 / 7 / 0 · 30 May 2018

cpSGD: Communication-efficient and differentially-private distributed SGD
Naman Agarwal, A. Suresh, Felix X. Yu, Sanjiv Kumar, H. B. McMahan · FedML · 28 / 486 / 0 · 27 May 2018

Scalable Methods for 8-bit Training of Neural Networks
Ron Banner, Itay Hubara, Elad Hoffer, Daniel Soudry · MQ · 54 / 332 / 0 · 25 May 2018

Double Quantization for Communication-Efficient Distributed Optimization
Yue Yu, Jiaxiang Wu, Longbo Huang · MQ · 19 / 57 / 0 · 25 May 2018

LAG: Lazily Aggregated Gradient for Communication-Efficient Distributed Learning
Tianyi Chen, G. Giannakis, Tao Sun, W. Yin · 34 / 297 / 0 · 25 May 2018

Local SGD Converges Fast and Communicates Little
Sebastian U. Stich · FedML · 85 / 1,047 / 0 · 24 May 2018

Approximate Random Dropout
Zhuoran Song, Ru Wang, Dongyu Ru, Hongru Huang, Zhenghao Peng, Hai Zhao, Xiaoyao Liang, Li Jiang · BDL · 30 / 9 / 0 · 23 May 2018

Sparse Binary Compression: Towards Distributed Deep Learning with minimal Communication
Felix Sattler, Simon Wiedemann, K. Müller, Wojciech Samek · MQ · 36 / 212 / 0 · 22 May 2018

Faster Neural Network Training with Approximate Tensor Operations
Menachem Adelman, Kfir Y. Levy, Ido Hakimi, M. Silberstein · 31 / 26 / 0 · 21 May 2018

Compressed Coded Distributed Computing
Songze Li, M. Maddah-ali, A. Avestimehr · 24 / 35 / 0 · 05 May 2018