PipeTransformer: Automated Elastic Pipelining for Distributed Training of Transformers

5 February 2021
Chaoyang He, Shen Li, Mahdi Soltanolkotabi, Salman Avestimehr

Papers citing "PipeTransformer: Automated Elastic Pipelining for Distributed Training of Transformers"

8 papers
ORBIT-2: Scaling Exascale Vision Foundation Models for Weather and Climate Downscaling
Xiao Wang, Jong Youl Choi, Takuya Kurihana, Isaac Lyngaas, Hong-Jun Yoon, ..., Dali Wang, Peter Thornton, Prasanna Balaprakash, M. Ashfaq, Dan Lu
07 May 2025 · 0 citations

TAGC: Optimizing Gradient Communication in Distributed Transformer Training
Igor Polyakov, Alexey Dukhanov, Egor Spirin
08 Apr 2025 · 0 citations

ProgFed: Effective, Communication, and Computation Efficient Federated Learning by Progressive Training
Hui-Po Wang, Sebastian U. Stich, Yang He, Mario Fritz
Tags: FedML, AI4CE
11 Oct 2021 · 48 citations

BAGUA: Scaling up Distributed Learning with System Relaxations
Shaoduo Gan, Xiangru Lian, Rui Wang, Jianbin Chang, Chengjun Liu, ..., Jiawei Jiang, Binhang Yuan, Sen Yang, Ji Liu, Ce Zhang
03 Jul 2021 · 30 citations

Subgraph Federated Learning with Missing Neighbor Generation
Ke Zhang, Carl Yang, Xiaoxiao Li, Lichao Sun, Siu-Ming Yiu
Tags: FedML
25 Jun 2021 · 164 citations

Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM
Deepak Narayanan, M. Shoeybi, Jared Casper, P. LeGresley, M. Patwary, ..., Prethvi Kashinkunti, J. Bernauer, Bryan Catanzaro, Amar Phanishayee, Matei A. Zaharia
Tags: MoE
09 Apr 2021 · 656 citations

Reservoir Transformers
Sheng Shen, Alexei Baevski, Ari S. Morcos, Kurt Keutzer, Michael Auli, Douwe Kiela
30 Dec 2020 · 17 citations

Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi, M. Patwary, Raul Puri, P. LeGresley, Jared Casper, Bryan Catanzaro
Tags: MoE
17 Sep 2019 · 1,836 citations