2408.08696
Cited By
Turning Trash into Treasure: Accelerating Inference of Large Language Models with Token Recycling
16 August 2024
Xianzhen Luo
Yixuan Wang
Qingfu Zhu
Zhiming Zhang
Xuanyu Zhang
Qing Yang
Dongliang Xu
Papers citing "Turning Trash into Treasure: Accelerating Inference of Large Language Models with Token Recycling"
4 / 4 papers shown
Title
DReSD: Dense Retrieval for Speculative Decoding
Milan Gritta
Huiyin Xue
Gerasimos Lampouras
RALM
100
0
0
24 Feb 2025
Lossless Acceleration of Large Language Models with Hierarchical Drafting based on Temporal Locality in Speculative Decoding
Sukmin Cho
S. Choi
T. Hwang
Jeongyeon Seo
Soyeong Jeong
Huije Lee
Hoyun Song
Jong C. Park
Youngjin Kwon
51
0
0
08 Feb 2025
SAM Decoding: Speculative Decoding via Suffix Automaton
Yuxuan Hu
Ke Wang
Jing Zhang
Fanjin Zhang
C. Li
H. Chen
Jing Zhang
49
1
0
16 Nov 2024
Make Some Noise: Unlocking Language Model Parallel Inference Capability through Noisy Training
Yixuan Wang
Xianzhen Luo
Fuxuan Wei
Yijun Liu
Qingfu Zhu
Xuanyu Zhang
Qing Yang
Dongliang Xu
Wanxiang Che
40
3
0
25 Jun 2024