1904.01576
Cited By
BARISTA: Efficient and Scalable Serverless Serving System for Deep Learning Prediction Services
2 April 2019
Anirban Bhattacharjee
A. Chhokra
Zhuangwei Kang
Hongyang Sun
A. Gokhale
G. Karsai
Papers citing
"BARISTA: Efficient and Scalable Serverless Serving System for Deep Learning Prediction Services"
18 / 18 papers shown
Title
SkyServe: Serving AI Models across Regions and Clouds with Spot Instances
Ziming Mao
Tian Xia
Zhanghao Wu
Wei-Lin Chiang
Tyler Griggs
Romil Bhardwaj
Zongheng Yang
S. Shenker
Ion Stoica
62
2
0
03 Nov 2024
Tally: Non-Intrusive Performance Isolation for Concurrent Deep Learning Workloads
Wei Zhao
Anand Jayarajan
Gennady Pekhimenko
FedML
46
1
0
09 Oct 2024
A House United Within Itself: SLO-Awareness for On-Premises Containerized ML Inference Clusters via Faro
Beomyeol Jeon
Chen Wang
Diana Arroyo
Alaa Youssef
Indranil Gupta
39
0
0
29 Sep 2024
HarmonyBatch: Batching multi-SLO DNN Inference with Heterogeneous Serverless Functions
Jiabin Chen
Fei Xu
Yikun Gu
Li Chen
Fangming Liu
Zhi Zhou
29
6
0
09 May 2024
MOPAR: A Model Partitioning Framework for Deep Learning Inference Services on Serverless Platforms
Jiaang Duan
Shiyou Qian
Dingyu Yang
Hanwen Hu
Jian Cao
Guangtao Xue
MoE
45
1
0
03 Apr 2024
Unveiling the frontiers of deep learning: innovations shaping diverse domains
Shams Forruque Ahmed
Md. Sakib Bin Alam
Maliha Kabir
Shaila Afrin
Sabiha Jannat Rafa
Aanushka Mehjabin
Amir H. Gandomi
AI4CE
42
2
0
06 Sep 2023
AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving
Zhuohan Li
Lianmin Zheng
Yinmin Zhong
Vincent Liu
Ying Sheng
...
Yanping Huang
Zhifeng Chen
Hao Zhang
Joseph E. Gonzalez
Ion Stoica
MoE
21
68
0
22 Feb 2023
Orloj: Predictably Serving Unpredictable DNNs
Peifeng Yu
Yuqing Qiu
Xin Jin
Mosharaf Chowdhury
22
1
0
31 Aug 2022
Deep Learning Workload Scheduling in GPU Datacenters: Taxonomy, Challenges and Vision
Wei Gao
Qi Hu
Zhisheng Ye
Peng Sun
Xiaolin Wang
Yingwei Luo
Tianwei Zhang
Yonggang Wen
88
26
0
24 May 2022
SMLT: A Serverless Framework for Scalable and Adaptive Machine Learning Design and Training
Ahsan Ali
Syed Zawad
Paarijaat Aditya
Istemi Ekin Akkus
Ruichuan Chen
Feng Yan
34
9
0
04 May 2022
Jarvis: Large-scale Server Monitoring with Adaptive Near-data Processing
Atul Sandur
Chanho Park
Stavros Volos
Gul Agha
Myeongjae Jeon
22
12
0
12 Feb 2022
On the Future of Cloud Engineering
David Bermbach
A. Chandra
C. Krintz
A. Gokhale
Aleksander Slominski
L. Thamsen
Everton Cavalcante
Tian Guo
Ivona Brandić
R. Wolski
43
23
0
19 Aug 2021
A Holistic View on Resource Management in Serverless Computing Environments: Taxonomy and Future Directions
Anupama Mampage
S. Karunasekera
Rajkumar Buyya
46
72
0
25 May 2021
Distributed Double Machine Learning with a Serverless Architecture
Malte S. Kurz
21
15
0
11 Jan 2021
A Review of Serverless Use Cases and their Characteristics
Simon Eismann
Joel Scheuner
Erwin Van Eyk
Maximilian Schwinger
Johannes Grohmann
N. Herbst
Cristina L. Abad
Alexandru Iosup
11
76
0
25 Aug 2020
Serving DNNs like Clockwork: Performance Predictability from the Bottom Up
A. Gujarati
Reza Karimi
Safya Alzayat
Wei Hao
Antoine Kaufmann
Ymir Vigfusson
Jonathan Mace
37
271
0
03 Jun 2020
Deep-Edge: An Efficient Framework for Deep Learning Model Update on Heterogeneous Edge
Anirban Bhattacharjee
A. Chhokra
Hongyang Sun
Shashank Shekhar
A. Gokhale
G. Karsai
A. Dubey
8
7
0
13 Apr 2020
Perseus: Characterizing Performance and Cost of Multi-Tenant Serving for CNN Models
Matthew LeMay
Shijian Li
Tian Guo
14
25
0
05 Dec 2019