PEARL: Parallel Speculative Decoding with Adaptive Draft Length

13 August 2024

Papers citing "PEARL: Parallel Speculative Decoding with Adaptive Draft Length"

Title | Authors | Tags | Counts | Date
SoftCoT++: Test-Time Scaling with Soft Chain-of-Thought Reasoning | Yige Xu, Xu Guo, Zhiwei Zeng, Chunyan Miao | BDL, LRM | 9 0 0 | 16 May 2025
SpecRouter: Adaptive Routing for Multi-Level Speculative Decoding in Large Language Models | Hang Wu, Jianian Zhu, Yongqian Li, Haojie Wang, Biao Hou, Jidong Zhai | - | 40 0 0 | 12 May 2025
Token-Driven GammaTune: Adaptive Calibration for Enhanced Speculative Decoding | Aayush Gautam, Susav Shrestha, Narasimha Annapareddy | - | 48 0 0 | 28 Mar 2025
Exploiting Edited Large Language Models as General Scientific Optimizers | Qitan Lv, T. Liu, Haoyu Wang | - | 41 0 0 | 08 Mar 2025
DuoDecoding: Hardware-aware Heterogeneous Speculative Decoding with Dynamic Multi-Sequence Drafting | Kai Lv, Honglin Guo, Qipeng Guo, Xipeng Qiu | - | 41 0 0 | 02 Mar 2025
Fuzzy Speculative Decoding for a Tunable Accuracy-Runtime Tradeoff | Maximilian Holsman, Yukun Huang, Bhuwan Dhingra | - | 39 0 0 | 28 Feb 2025
Speculative Decoding and Beyond: An In-Depth Survey of Techniques | Y. Hu, Zining Liu, Zhenyuan Dong, Tianfan Peng, Bradley McDanel, S. Zhang | - | 93 0 0 | 27 Feb 2025
TETRIS: Optimal Draft Token Selection for Batch Speculative Decoding | Zhaoxuan Wu, Zijian Zhou, Arun Verma, Alok Prakash, Daniela Rus, Bryan Kian Hsiang Low | - | 60 0 0 | 24 Feb 2025
Judge Decoding: Faster Speculative Sampling Requires Going Beyond Model Alignment | Gregor Bachmann, Sotiris Anagnostidis, Albert Pumarola, Markos Georgopoulos, A. Sanakoyeu, Yuming Du, Edgar Schönfeld, Ali K. Thabet, Jonas Kohler | ALM, BDL | 93 6 0 | 31 Jan 2025
ParallelSpec: Parallel Drafter for Efficient Speculative Decoding | Zilin Xiao, Hongming Zhang, Tao Ge, Siru Ouyang, Vicente Ordonez, Dong Yu | - | 39 5 0 | 08 Oct 2024
Dynamic Depth Decoding: Faster Speculative Decoding for LLMs | Oscar Brown, Zhengjie Wang, Andrea Do, Nikhil Mathew, Cheng Yu | - | 26 4 0 | 30 Aug 2024