ResearchTrend.AI
Themis: Fair and Efficient GPU Cluster Scheduling

2 July 2019
Kshiteej S. Mahajan
Arjun Balasubramanian
Arjun Singhvi
Shivaram Venkataraman
Aditya Akella
Amar Phanishayee
Shuchi Chawla

Papers citing "Themis: Fair and Efficient GPU Cluster Scheduling"

13 / 13 papers shown
A Codesign of Scheduling and Parallelization for Large Model Training in Heterogeneous Clusters
Chunyu Xue
Weihao Cui
Han Zhao
Quan Chen
Shulai Zhang
Peng Yang
Jing Yang
Shaobo Li
Minyi Guo
56
2
0
24 Mar 2024
Compass: A Decentralized Scheduler for Latency-Sensitive ML Workflows
Yuting Yang
Andrea Merlina
Weijia Song
Tiancheng Yuan
Ken Birman
Roman Vitenberg
49
0
0
27 Feb 2024
Towards providing reliable job completion time predictions using PCS
Abdullah Bin Faisal
Noah Martin
Hafiz Mohsin Bashir
Swaminathan Lamelas
Fahad R. Dogar
22
0
0
18 Jan 2024
Energy-Efficient GPU Clusters Scheduling for Deep Learning
Diandian Gu
Xintong Xie
Gang Huang
Xin Jin
Xuanzhe Liu
GNN
24
7
0
13 Apr 2023
Pathways: Asynchronous Distributed Dataflow for ML
P. Barham
Aakanksha Chowdhery
J. Dean
Sanjay Ghemawat
Steven Hand
...
Parker Schuh
Ryan Sepassi
Laurent El Shafey
C. A. Thekkath
Yonghui Wu
GNN
MoE
45
126
0
23 Mar 2022
GADGET: Online Resource Optimization for Scheduling Ring-All-Reduce Learning Jobs
Menglu Yu
Ye Tian
Bo Ji
Chuan Wu
Hridesh Rajan
Jia-Wei Liu
16
17
0
02 Feb 2022
TopoOpt: Co-optimizing Network Topology and Parallelization Strategy for Distributed Training Jobs
Weiyang Wang
Moein Khazraee
Zhizhen Zhong
M. Ghobadi
Zhihao Jia
Dheevatsa Mudigere
Ying Zhang
A. Kewitsch
39
85
0
01 Feb 2022
Egeria: Efficient DNN Training with Knowledge-Guided Layer Freezing
Yiding Wang
D. Sun
Kai Chen
Fan Lai
Mosharaf Chowdhury
33
44
0
17 Jan 2022
Enabling Level-4 Autonomous Driving on a Single $1k Off-the-Shelf Card
Hsin-Hsuan Sung
Yuanchao Xu
Jiexiong Guan
Wei Niu
Shaoshan Liu
Bin Ren
Yanzhi Wang
Xipeng Shen
23
3
0
12 Oct 2021
Serving DNN Models with Multi-Instance GPUs: A Case of the Reconfigurable Machine Scheduling Problem
Cheng Tan
Zhichao Li
Jian Zhang
Yunyin Cao
Sikai Qi
Zherui Liu
Yibo Zhu
Chuanxiong Guo
31
34
0
18 Sep 2021
VirtualFlow: Decoupling Deep Learning Models from the Underlying Hardware
Andrew Or
Haoyu Zhang
M. Freedman
17
9
0
20 Sep 2020
Hippo: Taming Hyper-parameter Optimization of Deep Learning with Stage Trees
Ahnjae Shin
Do Yoon Kim
Joo Seong Jeong
Byung-Gon Chun
25
4
0
22 Jun 2020
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Zhehuai Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
718
6,750
0
26 Sep 2016