2503.00491
Tutorial Proposal: Speculative Decoding for Efficient LLM Inference
1 March 2025
Heming Xia
Cunxiao Du
Yongqian Li
Qian Liu
Wenjie Li
Papers citing
"Tutorial Proposal: Speculative Decoding for Efficient LLM Inference"
3 / 3 papers shown
Speculative RAG: Enhancing Retrieval Augmented Generation through Drafting
Zilong Wang
Zifeng Wang
Long Le
Huaixiu Steven Zheng
Swaroop Mishra
...
Anush Mattapalli
Ankur Taly
Jingbo Shang
Chen-Yu Lee
Tomas Pfister
RALM
142
46
0
11 Jul 2024
EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty
Yuhui Li
Fangyun Wei
Chao Zhang
Hongyang R. Zhang
209
165
0
26 Jan 2024
Fast Transformer Decoding: One Write-Head is All You Need
Noam M. Shazeer
188
479
0
06 Nov 2019