2401.10774
Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads
19 January 2024
Tianle Cai
Yuhong Li
Zhengyang Geng
Hongwu Peng
Jason D. Lee
Deming Chen
Tri Dao
Papers citing
"Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads"
3 / 53 papers shown
Title
Decoding Speculative Decoding
Minghao Yan
Saurabh Agarwal
Shivaram Venkataraman
LRM
27
5
0
02 Feb 2024
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
311
11,915
0
04 Mar 2022
Locally Typical Sampling
Clara Meister
Tiago Pimentel
Gian Wiher
Ryan Cotterell
140
86
0
01 Feb 2022