2004.10856
TensorOpt: Exploring the Tradeoffs in Distributed DNN Training with Auto-Parallelism
16 April 2020
Zhenkun Cai
Kaihao Ma
Xiao Yan
Yidi Wu
Yuzhen Huang
James Cheng
Teng Su
F. Yu
Papers citing "TensorOpt: Exploring the Tradeoffs in Distributed DNN Training with Auto-Parallelism"
15 / 15 papers shown
Title
Hardware Scaling Trends and Diminishing Returns in Large-Scale Distributed Training
Jared Fernandez
Luca Wehrstedt
Leonid Shamis
Mostafa Elhoushi
Kalyan Saladi
Yonatan Bisk
Emma Strubell
Jacob Kahn
245
3
0
20 Nov 2024
EinDecomp: Decomposition of Declaratively-Specified Machine Learning and Numerical Computations for Parallel Execution
Daniel Bourgeois
Zhimin Ding
Dimitrije Jankov
Jiehui Li
Mahmoud Sleem
Yuxin Tang
Jiawen Yao
Xinyu Yao
Chris Jermaine
33
0
0
03 Oct 2024
Efficient Training of Large Language Models on Distributed Infrastructures: A Survey
Jiangfei Duan
Shuo Zhang
Zerui Wang
Lijuan Jiang
Wenwen Qu
...
Dahua Lin
Yonggang Wen
Xin Jin
Tianwei Zhang
Peng Sun
73
8
0
29 Jul 2024
SlipStream: Adapting Pipelines for Distributed Training of Large DNNs Amid Failures
Swapnil Gandhi
Mark Zhao
Athinagoras Skiadopoulos
Christos Kozyrakis
AI4CE
GNN
49
8
0
22 May 2024
Model Parallelism on Distributed Infrastructure: A Literature Review from Theory to LLM Case-Studies
Felix Brakel
Uraz Odyurt
A. Varbanescu
GNN
39
11
0
06 Mar 2024
AMSP: Reducing Communication Overhead of ZeRO for Efficient LLM Training
Qiaoling Chen
Qi Hu
Guoteng Wang
Zhisheng Ye
Ting Huang
...
Yang Gao
Hang Yan
Yonggang Wen
Tianwei Zhang
Peng Sun
37
6
0
01 Nov 2023
Automatic Task Parallelization of Dataflow Graphs in ML/DL models
Srinjoy Das
Lawrence Rauchwerger
19
0
0
22 Aug 2023
UniAP: Unifying Inter- and Intra-Layer Automatic Parallelism by Mixed Integer Quadratic Programming
Hao Lin
Ke Wu
Jie Li
Jun Yu Li
Wu-Jun Li
39
1
0
31 Jul 2023
Improving Automatic Parallel Training via Balanced Memory Workload Optimization
Yujie Wang
Youhe Jiang
Xupeng Miao
Fangcheng Fu
Shenhan Zhu
Xiaonan Nie
Yaofeng Tu
Bin Cui
48
9
0
05 Jul 2023
TAPS: Topology-Aware Intra-Operator Parallelism Strategy Searching Algorithm for Deep Neural Networks
Peng Liang
Hao Zheng
Teng Su
Linbo Qiao
Dongsheng Li
30
0
0
11 Jan 2023
Galvatron: Efficient Transformer Training over Multiple GPUs Using Automatic Parallelism
Xupeng Miao
Yujie Wang
Youhe Jiang
Chunan Shi
Xiaonan Nie
Hailin Zhang
Bin Cui
GNN
MoE
45
60
0
25 Nov 2022
Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning
Lianmin Zheng
Zhuohan Li
Hao Zhang
Yonghao Zhuang
Zhifeng Chen
...
Yuanzhong Xu
Danyang Zhuo
Eric P. Xing
Joseph E. Gonzalez
Ion Stoica
MoE
30
104
0
28 Jan 2022
End-to-end Adaptive Distributed Training on PaddlePaddle
Yulong Ao
Zhihua Wu
Dianhai Yu
Weibao Gong
Zhiqing Kui
...
Yanjun Ma
Tian Wu
Haifeng Wang
Wei Zeng
Chao Yang
19
10
0
06 Dec 2021
Local Critic Training for Model-Parallel Learning of Deep Neural Networks
Hojung Lee
Cho-Jui Hsieh
Jong-Seok Lee
28
15
0
03 Feb 2021
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Z. Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
716
6,748
0
26 Sep 2016