MuxServe: Flexible Spatial-Temporal Multiplexing for Multiple LLM Serving

2 April 2024

Xingcheng Zhang

Dahua Lin

Hao Zhang

Papers citing "MuxServe: Flexible Spatial-Temporal Multiplexing for Multiple LLM Serving"

HyGen: Efficient LLM Serving via Elastic Online-Offline Request Co-location | Ting Sun, Penghan Wang, Fan Lai | 327 1 0 | 15 Jan 2025
MuxFlow: Efficient and Safe GPU Sharing in Large-Scale Production Deep Learning Clusters | Yihao Zhao, Xin Liu, Shufan Liu, Xiang Li, Yibo Zhu, Gang Huang, Xuanzhe Liu, Xin Jin | 69 11 0 | 24 Mar 2023
Serving DNNs like Clockwork: Performance Predictability from the Bottom Up | A. Gujarati, Reza Karimi, Safya Alzayat, Wei Hao, Antoine Kaufmann, Ymir Vigfusson, Jonathan Mace | 62 272 0 | 03 Jun 2020
PyTorch: An Imperative Style, High-Performance Deep Learning Library | Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, ..., Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, Soumith Chintala | ODL | 83 42,038 0 | 03 Dec 2019