2109.11067
Serving DNN Models with Multi-Instance GPUs: A Case of the Reconfigurable Machine Scheduling Problem
18 September 2021
Cheng Tan
Zhichao Li
Jian Zhang
Yunyin Cao
Sikai Qi
Zherui Liu
Yibo Zhu
Chuanxiong Guo
Papers citing "Serving DNN Models with Multi-Instance GPUs: A Case of the Reconfigurable Machine Scheduling Problem"
12 / 12 papers shown
Title
LithOS: An Operating System for Efficient Machine Learning on GPUs
Patrick H. Coppock
Brian Zhang
Eliot H. Solomon
Vasilis Kypriotis
Leon Yang
Bikash Sharma
Dan Schatzberg
Todd C. Mowry
Dimitrios Skarlatos
40
0
0
21 Apr 2025
HyGen: Efficient LLM Serving via Elastic Online-Offline Request Co-location
Ting Sun
Penghan Wang
Fan Lai
193
1
0
15 Jan 2025
ParvaGPU: Efficient Spatial GPU Sharing for Large-Scale DNN Inference in Cloud Environments
Munkyu Lee
Sihoon Seong
Minki Kang
Jihyuk Lee
Gap-Joo Na
In-Geol Chun
Dimitrios Nikolopoulos
Cheol-Ho Hong
GNN
29
0
0
22 Sep 2024
Improving GPU Multi-Tenancy Through Dynamic Multi-Instance GPU Reconfiguration
Tianyu Wang
Sheng Li
Bingyao Li
Yuezhen Dai
Ao Li
Geng Yuan
Yufei Ding
Youtao Zhang
Xulong Tang
40
0
0
18 Jul 2024
MuxServe: Flexible Spatial-Temporal Multiplexing for Multiple LLM Serving
Jiangfei Duan
Runyu Lu
Haojie Duanmu
Xiuhong Li
Xingcheng Zhang
Dahua Lin
Ion Stoica
Hao Zhang
50
9
0
02 Apr 2024
A Unified CPU-GPU Protocol for GNN Training
Yi-Chien Lin
Gangda Deng
Viktor Prasanna
GNN
3DH
34
2
0
25 Mar 2024
A Survey of Serverless Machine Learning Model Inference
Kamil Kojs
46
2
0
22 Nov 2023
MIGPerf: A Comprehensive Benchmark for Deep Learning Training and Inference Workloads on Multi-Instance GPUs
Huaizheng Zhang
Yuanming Li
Wencong Xiao
Yizheng Huang
Xing Di
Jianxiong Yin
Simon See
Yong Luo
C. Lau
Yang You
VLM
31
3
0
01 Jan 2023
iGniter: Interference-Aware GPU Resource Provisioning for Predictable DNN Inference in the Cloud
Fei Xu
Jianian Xu
Jiabin Chen
Li Chen
Ruitao Shang
Zhi Zhou
Fengyuan Liu
GNN
32
35
0
03 Nov 2022
GMI-DRL: Empowering Multi-GPU Deep Reinforcement Learning with GPU Spatial Multiplexing
Yuke Wang
Boyuan Feng
Zihan Wang
Tong Geng
Ang Li
Yufei Ding
AI4CE
49
0
0
16 Jun 2022
Deep Learning Workload Scheduling in GPU Datacenters: Taxonomy, Challenges and Vision
Wei Gao
Qi Hu
Zhisheng Ye
Peng Sun
Xiaolin Wang
Yingwei Luo
Tianwei Zhang
Yonggang Wen
88
26
0
24 May 2022
A Survey of Multi-Tenant Deep Learning Inference on GPU
Fuxun Yu
Di Wang
Longfei Shangguan
Minjia Zhang
Chenchen Liu
Xiang Chen
BDL
AI4CE
26
32
0
17 Mar 2022