Tutorial Proposal: Speculative Decoding for Efficient LLM Inference

1 March 2025

Papers citing "Tutorial Proposal: Speculative Decoding for Efficient LLM Inference"

Speculative RAG: Enhancing Retrieval Augmented Generation through Drafting
Zilong Wang, Zifeng Wang, Long Le, Huaixiu Steven Zheng, Swaroop Mishra, ..., Anush Mattapalli, Ankur Taly, Jingbo Shang, Zifeng Wang, Tomas Pfister
RALM; 190 53 0; 11 Jul 2024

EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty
Yuhui Li, Fangyun Wei, Chao Zhang, Hongyang R. Zhang
272 202 0; 26 Jan 2024

Fast Transformer Decoding: One Write-Head is All You Need
Noam M. Shazeer
280 534 0; 06 Nov 2019