2501.12162
AdaServe: Accelerating Multi-SLO LLM Serving with SLO-Customized Speculative Decoding
21 January 2025
Zikun Li
Zhuofu Chen
Remi Delacourt
Gabriele Oliaro
Zeyu Wang
Qinghan Chen
Shuhuai Lin
April Yang
Zhihao Zhang
Zhuoming Chen
Sean Lai
Xinhao Cheng
Xupeng Miao
Zhihao Jia
Papers citing "AdaServe: Accelerating Multi-SLO LLM Serving with SLO-Customized Speculative Decoding"
2 / 2 papers shown
SLED: A Speculative LLM Decoding Framework for Efficient Edge Serving
Xiangchen Li
Dimitrios Spatharakis
Saeid Ghafouri
Jiakun Fan
Dimitrios Nikolopoulos
Deepu John
Bo Ji
Dimitrios S. Nikolopoulos
52
0
0
11 Jun 2025
MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context Generation with Speculative Decoding
Jian Chen
Vashisth Tiwari
Ranajoy Sadhukhan
Zhuoming Chen
Jinyuan Shi
Ian En-Hsu Yen
Avner May
Tianqi Chen
Beidi Chen
LRM
154
32
0
20 Aug 2024