S $^4$ C: Speculative Sampling with Syntactic and Semantic Coherence for Efficient Inference of Large Language Models

17 June 2025

Papers citing "S$^4$C: Speculative Sampling with Syntactic and Semantic Coherence for Efficient Inference of Large Language Models"

8 / 8 papers shown

Title
EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty Yuhui Li Fangyun Wei Chao Zhang Hongyang R. Zhang 117 165 0 26 Jan 2024
Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads Tianle Cai Yuhong Li Zhengyang Geng Hongwu Peng Jason D. Lee De-huai Chen Tri Dao 131 313 0 19 Jan 2024
Unlocking Efficiency in Large Language Model Inference: A Comprehensive Survey of Speculative Decoding Heming Xia Zhe Yang Qingxiu Dong Peiyi Wang Chak Tou Leong Tao Ge Tianyu Liu Wenjie Li Zhifang Sui LRM 122 129 0 15 Jan 2024
TinyLlama: An Open-Source Small Language Model Peiyuan Zhang Guangtao Zeng Tianduo Wang Wei Lu ALM LRM 142 406 0 04 Jan 2024
PaSS: Parallel Speculative Sampling Giovanni Monea Armand Joulin Edouard Grave MoE 67 38 0 22 Nov 2023
Accelerating LLM Inference with Staged Speculative Decoding Benjamin Spector Christal Re 69 112 0 08 Aug 2023
Predictive Pipelined Decoding: A Compute-Latency Trade-off for Exact LLM Decoding Seongjun Yang Gibbeum Lee Jaewoong Cho Dimitris Papailiopoulos Kangwook Lee 80 38 0 12 Jul 2023
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena Lianmin Zheng Wei-Lin Chiang Ying Sheng Siyuan Zhuang Zhanghao Wu ... Dacheng Li Eric Xing Haotong Zhang Joseph E. Gonzalez Ion Stoica ALM OSLM ELM 393 4,422 0 09 Jun 2023