2410.17375
AMUSD: Asynchronous Multi-Device Speculative Decoding for LLM Acceleration
22 October 2024
Bradley McDanel
LRM
Github (5★)
Papers citing
"AMUSD: Asynchronous Multi-Device Speculative Decoding for LLM Acceleration"
5 / 5 papers shown
Title
EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty
Yuhui Li
Fangyun Wei
Chao Zhang
Hongyang R. Zhang
142
165
0
26 Jan 2024
Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads
Tianle Cai
Yuhong Li
Zhengyang Geng
Hongwu Peng
Jason D. Lee
De-huai Chen
Tri Dao
172
314
0
19 Jan 2024
Unlocking Efficiency in Large Language Model Inference: A Comprehensive Survey of Speculative Decoding
Heming Xia
Zhe Yang
Qingxiu Dong
Peiyi Wang
Chak Tou Leong
Tao Ge
Tianyu Liu
Wenjie Li
Zhifang Sui
LRM
141
129
0
15 Jan 2024
Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena
Lianmin Zheng
Wei-Lin Chiang
Ying Sheng
Siyuan Zhuang
Zhanghao Wu
...
Dacheng Li
Eric Xing
Haotong Zhang
Joseph E. Gonzalez
Ion Stoica
ALM
OSLM
ELM
450
4,444
0
09 Jun 2023
Evaluating Large Language Models Trained on Code
Mark Chen
Jerry Tworek
Heewoo Jun
Qiming Yuan
Henrique Pondé
...
Bob McGrew
Dario Amodei
Sam McCandlish
Ilya Sutskever
Wojciech Zaremba
ELM
ALM
238
5,665
0
07 Jul 2021