2310.18813
Cited By
The Synergy of Speculative Decoding and Batching in Serving Large Language Models
28 October 2023
Qidong Su
Christina Giannoula
Gennady Pekhimenko
Papers citing
"The Synergy of Speculative Decoding and Batching in Serving Large Language Models"
3 / 3 papers shown
Title
PAPI: Exploiting Dynamic Parallelism in Large Language Model Decoding with a Processing-In-Memory-Enabled Computing System
Yintao He
Haiyu Mao
Christina Giannoula
Mohammad Sadrosadati
Juan Gómez Luna
Huawei Li
Xiaowei Li
Ying Wang
O. Mutlu
48
6
0
21 Feb 2025
MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context Generation with Speculative Decoding
Jian Chen
Vashisth Tiwari
Ranajoy Sadhukhan
Zhuoming Chen
Jinyuan Shi
Ian En-Hsu Yen
Avner May
Tianqi Chen
Beidi Chen
LRM
44
22
0
20 Aug 2024
Decoding Speculative Decoding
Minghao Yan
Saurabh Agarwal
Shivaram Venkataraman
LRM
45
6
0
02 Feb 2024