SpecDec++: Boosting Speculative Decoding via Adaptive Candidate Lengths

30 May 2024

Kaixuan Huang

Mengdi Wang

Papers citing "SpecDec++: Boosting Speculative Decoding via Adaptive Candidate Lengths"

Title | Authors | Tags | Counts | Date
DEL: Context-Aware Dynamic Exit Layer for Efficient Self-Speculative Decoding | Hossein Entezari Zarch, Lei Gao, Chaoyi Jiang, Murali Annavaram | LRM | 31 0 0 | 08 Apr 2025
Judge Decoding: Faster Speculative Sampling Requires Going Beyond Model Alignment | Gregor Bachmann, Sotiris Anagnostidis, Albert Pumarola, Markos Georgopoulos, A. Sanakoyeu, Yuming Du, Edgar Schönfeld, Ali K. Thabet, Jonas Kohler | ALM, BDL | 95 6 0 | 31 Jan 2025
Mixture of Attentions For Speculative Decoding | Matthieu Zimmer, Milan Gritta, Gerasimos Lampouras, Haitham Bou Ammar, Jun Wang | - | 76 4 0 | 04 Oct 2024
Merge, Ensemble, and Cooperate! A Survey on Collaborative Strategies in the Era of Large Language Models | Jinliang Lu, Ziliang Pang, Min Xiao, Yaochen Zhu, Rui Xia, Jiajun Zhang | MoMe | 49 18 0 | 08 Jul 2024
SnapKV: LLM Knows What You are Looking for Before Generation | Yuhong Li, Yingbing Huang, Bowen Yang, Bharat Venkitesh, Acyr F. Locatelli, Hanchen Ye, Tianle Cai, Patrick Lewis, Deming Chen | VLM | 79 157 0 | 22 Apr 2024
Speculative Streaming: Fast LLM Inference without Auxiliary Models | Nikhil Bhendawade, Irina Belousova, Qichen Fu, Henry Mason, Mohammad Rastegari, Mahyar Najibi | LRM | 34 28 0 | 16 Feb 2024
Break the Sequential Dependency of LLM Inference Using Lookahead Decoding | Yichao Fu, Peter Bailis, Ion Stoica, Hao Zhang | - | 130 141 0 | 03 Feb 2024
Diffusion-LM Improves Controllable Text Generation | Xiang Lisa Li, John Thickstun, Ishaan Gulrajani, Percy Liang, Tatsunori B. Hashimoto | AI4CE | 173 777 0 | 27 May 2022