TensorOpt: Exploring the Tradeoffs in Distributed DNN Training with Auto-Parallelism

16 April 2020

Papers citing "TensorOpt: Exploring the Tradeoffs in Distributed DNN Training with Auto-Parallelism"

15 / 15 papers shown

Title | Authors | Tags | Counts | Date
Hardware Scaling Trends and Diminishing Returns in Large-Scale Distributed Training | Jared Fernandez, Luca Wehrstedt, Leonid Shamis, Mostafa Elhoushi, Kalyan Saladi, Yonatan Bisk, Emma Strubell, Jacob Kahn | - | 245 3 0 | 20 Nov 2024
EinDecomp: Decomposition of Declaratively-Specified Machine Learning and Numerical Computations for Parallel Execution | Daniel Bourgeois, Zhimin Ding, Dimitrije Jankov, Jiehui Li, Mahmoud Sleem, Yuxin Tang, Jiawen Yao, Xinyu Yao, Chris Jermaine | - | 33 0 0 | 03 Oct 2024
Efficient Training of Large Language Models on Distributed Infrastructures: A Survey | Jiangfei Duan, Shuo Zhang, Zerui Wang, Lijuan Jiang, Wenwen Qu, ..., Dahua Lin, Yonggang Wen, Xin Jin, Tianwei Zhang, Peng Sun | - | 73 8 0 | 29 Jul 2024
SlipStream: Adapting Pipelines for Distributed Training of Large DNNs Amid Failures | Swapnil Gandhi, Mark Zhao, Athinagoras Skiadopoulos, Christos Kozyrakis | AI4CE, GNN | 49 8 0 | 22 May 2024
Model Parallelism on Distributed Infrastructure: A Literature Review from Theory to LLM Case-Studies | Felix Brakel, Uraz Odyurt, A. Varbanescu | GNN | 39 11 0 | 06 Mar 2024
AMSP: Reducing Communication Overhead of ZeRO for Efficient LLM Training | Qiaoling Chen, Qi Hu, Guoteng Wang, Zhisheng Ye, Ting Huang, ..., Yang Gao, Hang Yan, Yonggang Wen, Tianwei Zhang, Peng Sun | - | 37 6 0 | 01 Nov 2023
Automatic Task Parallelization of Dataflow Graphs in ML/DL models | Srinjoy Das, Lawrence Rauchwerger | - | 19 0 0 | 22 Aug 2023
UniAP: Unifying Inter- and Intra-Layer Automatic Parallelism by Mixed Integer Quadratic Programming | Hao Lin, Ke Wu, Jie Li, Jun Yu Li, Wu-Jun Li | - | 39 1 0 | 31 Jul 2023
Improving Automatic Parallel Training via Balanced Memory Workload Optimization | Yujie Wang, Youhe Jiang, Xupeng Miao, Fangcheng Fu, Shenhan Zhu, Xiaonan Nie, Yaofeng Tu, Bin Cui | - | 48 9 0 | 05 Jul 2023
TAPS: Topology-Aware Intra-Operator Parallelism Strategy Searching Algorithm for Deep Neural Networks | Peng Liang, Hao Zheng, Teng Su, Linbo Qiao, Dongsheng Li | - | 30 0 0 | 11 Jan 2023
Galvatron: Efficient Transformer Training over Multiple GPUs Using Automatic Parallelism | Xupeng Miao, Yujie Wang, Youhe Jiang, Chunan Shi, Xiaonan Nie, Hailin Zhang, Bin Cui | GNN, MoE | 45 60 0 | 25 Nov 2022
Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning | Lianmin Zheng, Zhuohan Li, Hao Zhang, Yonghao Zhuang, Zhifeng Chen, ..., Yuanzhong Xu, Danyang Zhuo, Eric P. Xing, Joseph E. Gonzalez, Ion Stoica | MoE | 30 104 0 | 28 Jan 2022
End-to-end Adaptive Distributed Training on PaddlePaddle | Yulong Ao, Zhihua Wu, Dianhai Yu, Weibao Gong, Zhiqing Kui, ..., Yanjun Ma, Tian Wu, Haifeng Wang, Wei Zeng, Chao Yang | - | 19 10 0 | 06 Dec 2021
Local Critic Training for Model-Parallel Learning of Deep Neural Networks | Hojung Lee, Cho-Jui Hsieh, Jong-Seok Lee | - | 28 15 0 | 03 Feb 2021
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation | Yonghui Wu, M. Schuster, Z. Chen, Quoc V. Le, Mohammad Norouzi, ..., Alex Rudnick, Oriol Vinyals, G. Corrado, Macduff Hughes, J. Dean | AIMat | 716 6,748 0 | 26 Sep 2016