Semi-Clairvoyant Scheduling of Speculative Decoding Requests to Minimize LLM Inference Latency

20 May 2025
Ruixiao Li
Fahao Chen
Peng Li

Papers citing "Semi-Clairvoyant Scheduling of Speculative Decoding Requests to Minimize LLM Inference Latency"

8 / 8 papers shown
Efficient Interactive LLM Serving with Proxy Model-based Sequence Length Prediction
Haoran Qiu
Weichao Mao
Archit Patke
Shengkun Cui
Saurabh Jha
Chen Wang
Hubertus Franke
Zbigniew T. Kalbarczyk
Tamer Basar
Ravishankar K. Iyer
54
26
0
12 Apr 2024
ExeGPT: Constraint-Aware Resource Scheduling for LLM Inference
Hyungjun Oh
Kihong Kim
Jaemin Kim
Sungkyun Kim
Junyeol Lee
Du-Seong Chang
Jiwon Seo
64
35
0
15 Mar 2024
Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads
Tianle Cai
Yuhong Li
Zhengyang Geng
Hongwu Peng
Jason D. Lee
Deming Chen
Tri Dao
114
296
0
19 Jan 2024
Accelerating LLM Inference with Staged Speculative Decoding
Benjamin Spector
Chris Re
63
107
0
08 Aug 2023
Large Language Models
Michael R Douglas
LLMAG
LM&MA
129
625
0
11 Jul 2023
Fast Inference from Transformers via Speculative Decoding
Yaniv Leviathan
Matan Kalman
Yossi Matias
LRM
111
702
0
30 Nov 2022
Program Synthesis with Large Language Models
Jacob Austin
Augustus Odena
Maxwell Nye
Maarten Bosma
Henryk Michalewski
...
Ellen Jiang
Carrie J. Cai
Michael Terry
Quoc V. Le
Charles Sutton
ELM
AIMat
ReCod
ALM
193
1,948
0
16 Aug 2021
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
741
41,894
0
28 May 2020