arXiv: 2410.08391
KV Prediction for Improved Time to First Token
10 October 2024
Maxwell Horton
Qingqing Cao
Chenfan Sun
Yanzi Jin
Sachin Mehta
Mohammad Rastegari
Moin Nabi
AI4TS
Papers citing "KV Prediction for Improved Time to First Token"
1 / 1 papers shown
Title
QuantSpec: Self-Speculative Decoding with Hierarchical Quantized KV Cache
Rishabh Tiwari
Haocheng Xi
Aditya Tomar
Coleman Hooper
Sehoon Kim
Maxwell Horton
Mahyar Najibi
Michael W. Mahoney
Kemal Kurniawan
Amir Gholami
MQ
64
1
0
05 Feb 2025