Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2205.14465
Cited By
v1
v2 (latest)
ByteComp: Revisiting Gradient Compression in Distributed Training
28 May 2022
Zhuang Wang
Yanghua Peng
Yibo Zhu
T. Ng
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"ByteComp: Revisiting Gradient Compression in Distributed Training"
29 / 29 papers shown
Title
Compressed Communication for Distributed Training: Adaptive Methods and System
Yuchen Zhong
Cong Xie
Shuai Zheng
Yanghua Peng
72
9
0
17 May 2021
MergeComp: A Compression Scheduler for Scalable Communication-Efficient Distributed Training
Zhuang Wang
X. Wu
T. Ng
GNN
22
4
0
28 Mar 2021
Is Network the Bottleneck of Distributed Training?
Zhen Zhang
Chaokun Chang
Yanghua Peng
Yida Wang
R. Arora
Xin Jin
89
71
0
17 Jun 2020
Ansor: Generating High-Performance Tensor Programs for Deep Learning
Lianmin Zheng
Chengfan Jia
Minmin Sun
Zhao Wu
Cody Hao Yu
...
Jun Yang
Danyang Zhuo
Koushik Sen
Joseph E. Gonzalez
Ion Stoica
144
403
0
11 Jun 2020
Blink: Fast and Generic Collectives for Distributed ML
Guanhua Wang
Shivaram Venkataraman
Amar Phanishayee
J. Thelin
Nikhil R. Devanur
Ion Stoica
VLM
56
140
0
11 Oct 2019
U-GAT-IT: Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization for Image-to-Image Translation
Junho Kim
Minjae Kim
Hyeonwoo Kang
Kwanghee Lee
ViT
48
560
0
25 Jul 2019
Priority-based Parameter Propagation for Distributed DNN Training
Anand Jayarajan
Jinliang Wei
Garth A. Gibson
Alexandra Fedorova
Gennady Pekhimenko
AI4CE
55
182
0
10 May 2019
Optimizing Network Performance for Distributed DNN Training on GPU Clusters: ImageNet/AlexNet Training in 1.5 Minutes
Peng Sun
Wansen Feng
Ruobing Han
Shengen Yan
Yonggang Wen
AI4CE
88
70
0
19 Feb 2019
Error Feedback Fixes SignSGD and other Gradient Compression Schemes
Sai Praneeth Karimireddy
Quentin Rebjock
Sebastian U. Stich
Martin Jaggi
85
503
0
28 Jan 2019
Error Compensated Quantized SGD and its Applications to Large-scale Distributed Optimization
Jiaxiang Wu
Weidong Huang
Junzhou Huang
Tong Zhang
86
236
0
21 Jun 2018
Know What You Don't Know: Unanswerable Questions for SQuAD
Pranav Rajpurkar
Robin Jia
Percy Liang
RALM
ELM
292
2,854
0
11 Jun 2018
Parameter Hub: a Rack-Scale Parameter Server for Distributed Deep Neural Network Training
Liang Luo
Jacob Nelson
Luis Ceze
Amar Phanishayee
Arvind Krishnamurthy
131
121
0
21 May 2018
TicTac: Accelerating Distributed Deep Learning with Communication Scheduling
Sayed Hadi Hashemi
Sangeetha Abdu Jyothi
R. Campbell
48
199
0
08 Mar 2018
Horovod: fast and easy distributed deep learning in TensorFlow
Alexander Sergeev
Mike Del Balso
102
1,222
0
15 Feb 2018
AdaComp : Adaptive Residual Gradient Compression for Data-Parallel Distributed Training
Chia-Yu Chen
Jungwook Choi
D. Brand
A. Agrawal
Wei Zhang
K. Gopalakrishnan
ODL
52
174
0
07 Dec 2017
Deep Gradient Compression: Reducing the Communication Bandwidth for Distributed Training
Chengyue Wu
Song Han
Huizi Mao
Yu Wang
W. Dally
152
1,410
0
05 Dec 2017
Gradient Sparsification for Communication-Efficient Distributed Optimization
Jianqiao Wangni
Jialei Wang
Ji Liu
Tong Zhang
100
529
0
26 Oct 2017
Regularizing and Optimizing LSTM Language Models
Stephen Merity
N. Keskar
R. Socher
173
1,096
0
07 Aug 2017
Poseidon: An Efficient Communication Architecture for Distributed Deep Learning on GPU Clusters
Huatian Zhang
Zeyu Zheng
Shizhen Xu
Wei-Ming Dai
Qirong Ho
Xiaodan Liang
Zhiting Hu
Jinliang Wei
P. Xie
Eric Xing
GNN
72
348
0
11 Jun 2017
TernGrad: Ternary Gradients to Reduce Communication in Distributed Deep Learning
W. Wen
Cong Xu
Feng Yan
Chunpeng Wu
Yandan Wang
Yiran Chen
Hai Helen Li
184
990
0
22 May 2017
Sparse Communication for Distributed Gradient Descent
Alham Fikri Aji
Kenneth Heafield
89
742
0
17 Apr 2017
In-Datacenter Performance Analysis of a Tensor Processing Unit
N. Jouppi
C. Young
Nishant Patil
David Patterson
Gaurav Agrawal
...
Vijay Vasudevan
Richard Walter
Walter Wang
Eric Wilcox
Doe Hyun Yoon
239
4,644
0
16 Apr 2017
Pointer Sentinel Mixture Models
Stephen Merity
Caiming Xiong
James Bradbury
R. Socher
RALM
346
2,900
0
26 Sep 2016
TensorFlow: A system for large-scale machine learning
Martín Abadi
P. Barham
Jianmin Chen
Zhiwen Chen
Andy Davis
...
Vijay Vasudevan
Pete Warden
Martin Wicke
Yuan Yu
Xiaoqiang Zhang
GNN
AI4CE
435
18,361
0
27 May 2016
Revisiting Distributed Synchronous SGD
Jianmin Chen
Xinghao Pan
R. Monga
Samy Bengio
Rafal Jozefowicz
89
801
0
04 Apr 2016
MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems
Tianqi Chen
Mu Li
Yutian Li
Min Lin
Naiyan Wang
Minjie Wang
Tianjun Xiao
Bing Xu
Chiyuan Zhang
Zheng Zhang
200
2,248
0
03 Dec 2015
Rethinking the Inception Architecture for Computer Vision
Christian Szegedy
Vincent Vanhoucke
Sergey Ioffe
Jonathon Shlens
Z. Wojna
3DV
BDL
886
27,427
0
02 Dec 2015
8-Bit Approximations for Parallelism in Deep Learning
Tim Dettmers
81
176
0
14 Nov 2015
cuDNN: Efficient Primitives for Deep Learning
Sharan Chetlur
Cliff Woolley
Philippe Vandermersch
Jonathan M. Cohen
J. Tran
Bryan Catanzaro
Evan Shelhamer
140
1,850
0
03 Oct 2014
1