
Serving DNN Models with Multi-Instance GPUs: A Case of the Reconfigurable Machine Scheduling Problem

18 September 2021
Cheng Tan
Zhichao Li
Jian Zhang
Yunyin Cao
Sikai Qi
Zherui Liu
Yibo Zhu
Chuanxiong Guo

Papers citing "Serving DNN Models with Multi-Instance GPUs: A Case of the Reconfigurable Machine Scheduling Problem"

12 / 12 papers shown
LithOS: An Operating System for Efficient Machine Learning on GPUs
Patrick H. Coppock
Brian Zhang
Eliot H. Solomon
Vasilis Kypriotis
Leon Yang
Bikash Sharma
Dan Schatzberg
Todd C. Mowry
Dimitrios Skarlatos
40
0
0
21 Apr 2025
HyGen: Efficient LLM Serving via Elastic Online-Offline Request Co-location
Ting Sun
Penghan Wang
Fan Lai
193
1
0
15 Jan 2025
ParvaGPU: Efficient Spatial GPU Sharing for Large-Scale DNN Inference in Cloud Environments
Munkyu Lee
Sihoon Seong
Minki Kang
Jihyuk Lee
Gap-Joo Na
In-Geol Chun
Dimitrios Nikolopoulos
Cheol-Ho Hong
GNN
29
0
0
22 Sep 2024
Improving GPU Multi-Tenancy Through Dynamic Multi-Instance GPU Reconfiguration
Tianyu Wang
Sheng Li
Bingyao Li
Yuezhen Dai
Ao Li
Geng Yuan
Yufei Ding
Youtao Zhang
Xulong Tang
40
0
0
18 Jul 2024
MuxServe: Flexible Spatial-Temporal Multiplexing for Multiple LLM Serving
Jiangfei Duan
Runyu Lu
Haojie Duanmu
Xiuhong Li
Xingcheng Zhang
Dahua Lin
Ion Stoica
Hao Zhang
50
9
0
02 Apr 2024
A Unified CPU-GPU Protocol for GNN Training
Yi-Chien Lin
Gangda Deng
Viktor Prasanna
GNN
3DH
34
2
0
25 Mar 2024
A Survey of Serverless Machine Learning Model Inference
Kamil Kojs
46
2
0
22 Nov 2023
MIGPerf: A Comprehensive Benchmark for Deep Learning Training and Inference Workloads on Multi-Instance GPUs
Huaizheng Zhang
Yuanming Li
Wencong Xiao
Yizheng Huang
Xing Di
Jianxiong Yin
Simon See
Yong Luo
C. Lau
Yang You
VLM
31
3
0
01 Jan 2023
iGniter: Interference-Aware GPU Resource Provisioning for Predictable DNN Inference in the Cloud
Fei Xu
Jianian Xu
Jiabin Chen
Li Chen
Ruitao Shang
Zhi Zhou
Fengyuan Liu
GNN
32
35
0
03 Nov 2022
GMI-DRL: Empowering Multi-GPU Deep Reinforcement Learning with GPU Spatial Multiplexing
Yuke Wang
Boyuan Feng
Zihan Wang
Tong Geng
Ang Li
Yufei Ding
AI4CE
49
0
0
16 Jun 2022
Deep Learning Workload Scheduling in GPU Datacenters: Taxonomy, Challenges and Vision
Wei Gao
Qi Hu
Zhisheng Ye
Peng Sun
Xiaolin Wang
Yingwei Luo
Tianwei Zhang
Yonggang Wen
88
26
0
24 May 2022
A Survey of Multi-Tenant Deep Learning Inference on GPU
Fuxun Yu
Di Wang
Longfei Shangguan
Minjia Zhang
Chenchen Liu
Xiang Chen
BDL
AI4CE
26
32
0
17 Mar 2022