2109.03389
Cited By
An Optimal Resource Allocator of Elastic Training for Deep Learning Jobs on Cloud
8 September 2021
Liang Hu
Jiangcheng Zhu
Zirui Zhou
Ruiqing Cheng
Xiaolong Bai
Yong Zhang
Papers citing
"An Optimal Resource Allocator of Elastic Training for Deep Learning Jobs on Cloud"
3 / 3 papers shown
Title
1.5-Pints Technical Report: Pretraining in Days, Not Months -- Your Language Model Thrives on Quality Data
Calvin Tan
Jerome Wang
ALM
38
2
0
07 Aug 2024
TRANSOM: An Efficient Fault-Tolerant System for Training LLMs
Baodong Wu
Lei Xia
Qingping Li
Kangyu Li
Xu Chen
Yongqiang Guo
Tieyao Xiang
Yuheng Chen
Shigang Li
40
11
0
16 Oct 2023
End-to-end Adaptive Distributed Training on PaddlePaddle
Yulong Ao
Zhihua Wu
Dianhai Yu
Weibao Gong
Zhiqing Kui
...
Yanjun Ma
Tian Wu
Haifeng Wang
Wei Zeng
Chao Yang
19
10
0
06 Dec 2021