2304.10892
Cited By
Reconciling High Accuracy, Cost-Efficiency, and Low Latency of Inference Serving Systems
21 April 2023
Mehran Salmani
Saeid Ghafouri
Alireza Sanaee
Kamran Razavi
M. Mühlhäuser
Joseph Doyle
Pooyan Jamshidi
U. O. N. Carolina
Papers citing "Reconciling High Accuracy, Cost-Efficiency, and Low Latency of Inference Serving Systems"
4 / 4 papers shown
EdgeRL: Reinforcement Learning-driven Deep Learning Model Inference Optimization at Edge
Motahare Mounesan, Xiaojie Zhang, S. Debroy
21 1 0
16 Oct 2024

Sponge: Inference Serving with Dynamic SLOs Using In-Place Vertical Scaling
Kamran Razavi, Saeid Ghafouri, Max Mühlhäuser, Pooyan Jamshidi, Lin Wang
29 3 0
31 Mar 2024

Resource Allocation of Industry 4.0 Micro-Service Applications across Serverless Fog Federation
R. Hussain, Mohsen Amini Salehi
25 12 0
14 Jan 2024

IPA: Inference Pipeline Adaptation to Achieve High Accuracy and Cost-Efficiency
Saeid Ghafouri, Kamran Razavi, Mehran Salmani, Alireza Sanaee, T. Lorido-Botran, Lin Wang, Joseph Doyle, Pooyan Jamshidi
35 2 0
24 Aug 2023