Towards Fast Multilingual LLM Inference: Speculative Decoding and Specialized Drafters

Papers citing "Towards Fast Multilingual LLM Inference: Speculative Decoding and Specialized Drafters"

13 / 13 papers shown
Mixture of Attentions For Speculative Decoding
Matthieu Zimmer, Milan Gritta, Gerasimos Lampouras, Haitham Bou Ammar, Jun Wang
04 Oct 2024