Dynamic Space-Time Scheduling for GPU Inference
arXiv:1901.00041 · 31 December 2018
Paras Jain, Xiangxi Mo, Ajay Jain, Harikaran Subbaraj, Rehana Durrani, Alexey Tumanov, Joseph E. Gonzalez, Ion Stoica

Papers citing "Dynamic Space-Time Scheduling for GPU Inference" (19 papers)
CascadeServe: Unlocking Model Cascades for Inference Serving
Ferdi Kossmann, Ziniu Wu, Alex Turk, Nesime Tatbul, Lei Cao, Samuel Madden
20 Jun 2024

Hydro: Adaptive Query Processing of ML Queries
Gaurav Tarlok Kakkar, Jiashen Cao, Aubhro Sengupta, Joy Arulraj, Hyesoon Kim
22 Mar 2024

A Survey of Serverless Machine Learning Model Inference
Kamil Kojs
22 Nov 2023

Throughput Maximization of DNN Inference: Batching or Multi-Tenancy?
Seyed Morteza Nabavinejad, M. Ebrahimi, Sherief Reda
26 Aug 2023

Miriam: Exploiting Elastic Kernels for Real-time Multi-DNN Inference on Edge GPU
Zhihe Zhao, Neiwen Ling, Nan Guan, Guoliang Xing
10 Jul 2023

D-STACK: High Throughput DNN Inference by Effective Multiplexing and Spatio-Temporal Scheduling of GPUs
Aditya Dhakal, Sameer G. Kulkarni, K. Ramakrishnan
31 Mar 2023

A Study on the Intersection of GPU Utilization and CNN Inference
J. Kosaian, Amar Phanishayee
15 Dec 2022

iGniter: Interference-Aware GPU Resource Provisioning for Predictable DNN Inference in the Cloud
Fei Xu, Jianian Xu, Jiabin Chen, Li Chen, Ruitao Shang, Zhi Zhou, Fengyuan Liu
03 Nov 2022

Deep Learning Workload Scheduling in GPU Datacenters: Taxonomy, Challenges and Vision
Wei Gao, Qi Hu, Zhisheng Ye, Peng Sun, Xiaolin Wang, Yingwei Luo, Tianwei Zhang, Yonggang Wen
24 May 2022

Batched matrix operations on distributed GPUs with application in theoretical physics
Nenad Mijić, Davor Davidović
17 Mar 2022

Characterizing Concurrency Mechanisms for NVIDIA GPUs under Deep Learning Workloads
Guin Gilman, R. Walls
01 Oct 2021

Multi-model Machine Learning Inference Serving with GPU Spatial Partitioning
S. Choi, Sunho Lee, Yeonjae Kim, Jongse Park, Youngjin Kwon, Jaehyuk Huh
01 Sep 2021

Boggart: Towards General-Purpose Acceleration of Retrospective Video Analytics
Neil Agarwal, Ravi Netravali
21 Jun 2021

Contention-Aware GPU Partitioning and Task-to-Partition Allocation for Real-Time Workloads
Houssam-Eddine Zahaf, Ignacio Sañudo Olmedo, Jayati Singh, Nicola Capodieci, Sébastien Faucou
21 May 2021

Accelerating Multi-Model Inference by Merging DNNs of Different Weights
Joo Seong Jeong, Soojeong Kim, Gyeong-In Yu, Yunseong Lee, Byung-Gon Chun
28 Sep 2020

Spatial Sharing of GPU for Autotuning DNN models
Aditya Dhakal, Junguk Cho, Sameer G. Kulkarni, K. Ramakrishnan, P. Sharma
08 Aug 2020

Perseus: Characterizing Performance and Cost of Multi-Tenant Serving for CNN Models
Matthew LeMay, Shijian Li, Tian Guo
05 Dec 2019

INFaaS: A Model-less and Managed Inference Serving System
Francisco Romero, Qian Li, N. Yadwadkar, Christos Kozyrakis
30 May 2019

The OoO VLIW JIT Compiler for GPU Inference
Paras Jain, Xiangxi Mo, Ajay Jain, Alexey Tumanov, Joseph E. Gonzalez, Ion Stoica
28 Jan 2019