arXiv:1811.03619
Pipe-SGD: A Decentralized Pipelined SGD Framework for Distributed Deep Net Training
8 November 2018
Youjie Li, Hang Qiu, Songze Li, A. Avestimehr, N. Kim, A. Schwing
FedML
Papers citing "Pipe-SGD: A Decentralized Pipelined SGD Framework for Distributed Deep Net Training" (45 papers shown)
Acceleration for Deep Reinforcement Learning using Parallel and Distributed Computing: A Survey
Zhihong Liu, Xin Xu, Peng Qiao, Dongsheng Li. OffRL. 08 Nov 2024.
Pipeline Gradient-based Model Training on Analog In-memory Accelerators
Zhaoxian Wu, Quan-Wu Xiao, Tayfun Gokmen, H. Tsai, Kaoutar El Maghraoui, Tianyi Chen. 19 Oct 2024.
FedEx: Expediting Federated Learning over Heterogeneous Mobile Devices by Overlapping and Participant Selection
Jiaxiang Geng, Boyu Li, Xiaoqi Qin, Yixuan Li, Liang Li, Yanzhao Hou, Miao Pan. FedML. 01 Jul 2024.
MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs
Ziheng Jiang, Yanghua Peng, Yinmin Zhong, Qi Huang, Yangrui Chen, ..., Zhe Li, X. Jia, Jia-jun Ye, Xin Jin, Xin Liu. LRM. 23 Feb 2024.
MSPipe: Efficient Temporal GNN Training via Staleness-Aware Pipeline
Guangming Sheng, Junwei Su, Chao Huang, Chuan Wu. 23 Feb 2024.
DIGEST: Fast and Communication Efficient Decentralized Learning with Local Updates
Peyman Gholami, H. Seferoglu. FedML. 14 Jul 2023.
FreshGNN: Reducing Memory Access via Stable Historical Embeddings for Graph Neural Network Training
Kezhao Huang, Haitian Jiang, Minjie Wang, Guangxuan Xiao, David Wipf, Xiang Song, Quan Gan, Zengfeng Huang, Jidong Zhai, Zheng-Wei Zhang. GNN. 18 Jan 2023.
Personalized Federated Learning with Communication Compression
El Houcine Bergou, Konstantin Burlachenko, Aritra Dutta, Peter Richtárik. FedML. 12 Sep 2022.
Towards Efficient Communications in Federated Learning: A Contemporary Survey
Zihao Zhao, Yuzhu Mao, Yang Liu, Linqi Song, Ouyang Ye, Xinlei Chen, Wenbo Ding. FedML. 02 Aug 2022.
Reducing Training Time in Cross-Silo Federated Learning using Multigraph Topology
Tuong Khanh Long Do, Binh X. Nguyen, Vuong Pham, Toan V. Tran, Erman Tjiputra, Quang-Dieu Tran, A. Nguyen. FedML, AI4CE. 20 Jul 2022.
Fine-tuning Language Models over Slow Networks using Activation Compression with Guarantees
Jue Wang, Binhang Yuan, Luka Rimanic, Yongjun He, Tri Dao, Beidi Chen, Christopher Ré, Ce Zhang. AI4CE. 02 Jun 2022.
Decentralized Training of Foundation Models in Heterogeneous Environments
Binhang Yuan, Yongjun He, Jared Davis, Tianyi Zhang, Tri Dao, Beidi Chen, Percy Liang, Christopher Ré, Ce Zhang. 02 Jun 2022.
Locally Asynchronous Stochastic Gradient Descent for Decentralised Deep Learning
Tomer Avidor, Nadav Tal-Israel. 24 Mar 2022.
BNS-GCN: Efficient Full-Graph Training of Graph Convolutional Networks with Partition-Parallelism and Random Boundary Node Sampling
Cheng Wan, Youjie Li, Ang Li, Namjae Kim, Yingyan Lin. GNN. 21 Mar 2022.
PipeGCN: Efficient Full-Graph Training of Graph Convolutional Networks with Pipelined Feature Communication
Cheng Wan, Youjie Li, Cameron R. Wolfe, Anastasios Kyrillidis, Namjae Kim, Yingyan Lin. GNN. 20 Mar 2022.
Harmony: Overcoming the Hurdles of GPU Memory Capacity to Train Massive DNN Models on Commodity Servers
Youjie Li, Amar Phanishayee, D. Murray, Jakub Tarnawski, N. Kim. 02 Feb 2022.
Persia: An Open, Hybrid System Scaling Deep Learning-based Recommenders up to 100 Trillion Parameters
Xiangru Lian, Binhang Yuan, Xuefeng Zhu, Yulong Wang, Yongjun He, ..., Lei Yuan, Hai-bo Yu, Sen Yang, Ce Zhang, Ji Liu. VLM. 10 Nov 2021.
LayerPipe: Accelerating Deep Neural Network Training by Intra-Layer and Inter-Layer Gradient Pipelining and Multiprocessor Scheduling
Nanda K. Unnikrishnan, Keshab K. Parhi. AI4CE. 14 Aug 2021.
BAGUA: Scaling up Distributed Learning with System Relaxations
Shaoduo Gan, Xiangru Lian, Rui Wang, Jianbin Chang, Chengjun Liu, ..., Jiawei Jiang, Binhang Yuan, Sen Yang, Ji Liu, Ce Zhang. 03 Jul 2021.
CD-SGD: Distributed Stochastic Gradient Descent with Compression and Delay Compensation
Enda Yu, Dezun Dong, Yemao Xu, Shuo Ouyang, Xiangke Liao. 21 Jun 2021.
1-bit LAMB: Communication Efficient Large-Scale Large-Batch Training with LAMB's Convergence Speed
Conglong Li, A. A. Awan, Hanlin Tang, Samyam Rajbhandari, Yuxiong He. 13 Apr 2021.
Distributed Learning Systems with First-order Methods
Ji Liu, Ce Zhang. 12 Apr 2021.
EventGraD: Event-Triggered Communication in Parallel Machine Learning
Soumyadip Ghosh, B. Aquino, V. Gupta. FedML. 12 Mar 2021.
1-bit Adam: Communication Efficient Large-Scale Training with Adam's Convergence Speed
Hanlin Tang, Shaoduo Gan, A. A. Awan, Samyam Rajbhandari, Conglong Li, Xiangru Lian, Ji Liu, Ce Zhang, Yuxiong He. AI4CE. 04 Feb 2021.
TornadoAggregate: Accurate and Scalable Federated Learning via the Ring-Based Architecture
Jin-Woo Lee, Jaehoon Oh, Sungsu Lim, Se-Young Yun, Jae-Gil Lee. FedML. 06 Dec 2020.
APMSqueeze: A Communication Efficient Adam-Preconditioned Momentum SGD Algorithm
Hanlin Tang, Shaoduo Gan, Samyam Rajbhandari, Xiangru Lian, Ji Liu, Yuxiong He, Ce Zhang. 26 Aug 2020.
The Case for Strong Scaling in Deep Learning: Training Large 3D CNNs with Hybrid Parallelism
Yosuke Oyama, N. Maruyama, Nikoli Dryden, Erin McCarthy, P. Harrington, J. Balewski, Satoshi Matsuoka, Peter Nugent, B. Van Essen. 3DV, AI4CE. 25 Jul 2020.
DBS: Dynamic Batch Size For Distributed Deep Neural Network Training
Qing Ye, Yuhao Zhou, Mingjia Shi, Yanan Sun, Jiancheng Lv. 23 Jul 2020.
Shuffle-Exchange Brings Faster: Reduce the Idle Time During Communication for Decentralized Neural Network Training
Xiang Yang. FedML. 01 Jul 2020.
Data Movement Is All You Need: A Case Study on Optimizing Transformers
A. Ivanov, Nikoli Dryden, Tal Ben-Nun, Shigang Li, Torsten Hoefler. 30 Jun 2020.
HetPipe: Enabling Large DNN Training on (Whimpy) Heterogeneous GPU Clusters through Integration of Pipelined Model Parallelism and Data Parallelism
Jay H. Park, Gyeongchan Yun, Chang Yi, N. T. Nguyen, Seungmin Lee, Jaesik Choi, S. Noh, Young-ri Choi. MoE. 28 May 2020.
MixML: A Unified Analysis of Weakly Consistent Parallel Learning
Yucheng Lu, J. Nash, Christopher De Sa. FedML. 14 May 2020.
Detached Error Feedback for Distributed SGD with Random Sparsification
An Xu, Heng-Chiao Huang. 11 Apr 2020.
Turbo-Aggregate: Breaking the Quadratic Aggregation Barrier in Secure Federated Learning
Jinhyun So, Başak Güler, A. Avestimehr. FedML. 11 Feb 2020.
Intermittent Pulling with Local Compensation for Communication-Efficient Federated Learning
Yining Qi, Zhihao Qu, Song Guo, Xin Gao, Ruixuan Li, Baoliu Ye. FedML. 22 Jan 2020.
Local AdaAlter: Communication-Efficient Stochastic Gradient Descent with Adaptive Learning Rates
Cong Xie, Oluwasanmi Koyejo, Indranil Gupta, Yanghua Peng. 20 Nov 2019.
Layer-wise Adaptive Gradient Sparsification for Distributed Deep Learning with Convergence Guarantees
S. Shi, Zhenheng Tang, Qiang-qiang Wang, Kaiyong Zhao, Xiaowen Chu. 20 Nov 2019.
Central Server Free Federated Learning over Single-sided Trust Social Networks
Chaoyang He, Conghui Tan, Hanlin Tang, Shuang Qiu, Ji Liu. FedML. 11 Oct 2019.
Heterogeneity-Aware Asynchronous Decentralized Training
Qinyi Luo, Jiaao He, Youwei Zhuo, Xuehai Qian. 17 Sep 2019.
DeepSqueeze: Decentralization Meets Error-Compensated Compression
Hanlin Tang, Xiangru Lian, Shuang Qiu, Lei Yuan, Ce Zhang, Tong Zhang, Liu. 17 Jul 2019.
Faster Distributed Deep Net Training: Computation and Communication Decoupled Stochastic Gradient Descent
Shuheng Shen, Linli Xu, Jingchang Liu, Xianfeng Liang, Yifei Cheng. ODL, FedML. 28 Jun 2019.
A Bi-Directional Co-Design Approach to Enable Deep Learning on IoT Devices
Xiaofan Zhang, Cong Hao, Yuhong Li, Yao Chen, Jinjun Xiong, Wen-mei W. Hwu, Deming Chen. 20 May 2019.
DoubleSqueeze: Parallel Stochastic Gradient Descent with Double-Pass Error-Compensated Compression
Hanlin Tang, Xiangru Lian, Chen Yu, Tong Zhang, Ji Liu. 15 May 2019.
CodedReduce: A Fast and Robust Framework for Gradient Aggregation in Distributed Learning
Amirhossein Reisizadeh, Saurav Prakash, Ramtin Pedarsani, A. Avestimehr. 06 Feb 2019.
Fully Asynchronous Distributed Optimization with Linear Convergence in Directed Networks
Jiaqi Zhang, Keyou You. 24 Jan 2019.