Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2402.13720
Cited By
Ouroboros: Generating Longer Drafts Phrase by Phrase for Faster Speculative Decoding
21 February 2024
Weilin Zhao
Yuxiang Huang
Xu Han
Wang Xu
Chaojun Xiao
Xinrong Zhang
Yewei Fang
Kaihuo Zhang
Zhiyuan Liu
Maosong Sun
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Ouroboros: Generating Longer Drafts Phrase by Phrase for Faster Speculative Decoding"
5 / 5 papers shown
Title
PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU
Yixin Song
Zeyu Mi
Haotong Xie
Haibo Chen
BDL
125
120
0
16 Dec 2023
Distilling Linguistic Context for Language Model Compression
Geondo Park
Gyeongman Kim
Eunho Yang
48
38
0
17 Sep 2021
The Lottery Ticket Hypothesis for Pre-trained BERT Networks
Tianlong Chen
Jonathan Frankle
Shiyu Chang
Sijia Liu
Yang Zhang
Zhangyang Wang
Michael Carbin
156
377
0
23 Jul 2020
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
245
1,826
0
17 Sep 2019
Teaching Machines to Read and Comprehend
Karl Moritz Hermann
Tomás Kociský
Edward Grefenstette
L. Espeholt
W. Kay
Mustafa Suleyman
Phil Blunsom
184
3,513
0
10 Jun 2015
1