Efficient Inference for Large Language Model-based Generative Recommendation
Versions: v1, v2, v3 (latest)

Papers citing "Efficient Inference for Large Language Model-based Generative Recommendation"

35 / 35 papers shown
Online Speculative Decoding
Xiaoxuan Liu, Lanxiang Hu, Peter Bailis, Alvin Cheung, Zhijie Deng, Ion Stoica, Hao Zhang
11 Oct 2023
