An Optimal Resource Allocator of Elastic Training for Deep Learning Jobs on Cloud

8 September 2021
Liang Hu
Jiangcheng Zhu
Zirui Zhou
Ruiqing Cheng
Xiaolong Bai
Yong Zhang

Papers citing "An Optimal Resource Allocator of Elastic Training for Deep Learning Jobs on Cloud"

3 / 3 papers shown
Title
1.5-Pints Technical Report: Pretraining in Days, Not Months -- Your Language Model Thrives on Quality Data
Calvin Tan
Jerome Wang
ALM
38
2
0
07 Aug 2024
TRANSOM: An Efficient Fault-Tolerant System for Training LLMs
Baodong Wu
Lei Xia
Qingping Li
Kangyu Li
Xu Chen
Yongqiang Guo
Tieyao Xiang
Yuheng Chen
Shigang Li
40
11
0
16 Oct 2023
End-to-end Adaptive Distributed Training on PaddlePaddle
Yulong Ao
Zhihua Wu
Dianhai Yu
Weibao Gong
Zhiqing Kui
...
Yanjun Ma
Tian Wu
Haifeng Wang
Wei Zeng
Chao Yang
19
10
0
06 Dec 2021