2402.06082
SubGen: Token Generation in Sublinear Time and Memory
8 February 2024
A. Zandieh
Insu Han
Vahab Mirrokni
Amin Karbasi
Papers citing "SubGen: Token Generation in Sublinear Time and Memory"
4 / 4 papers shown
Title
TurboQuant: Online Vector Quantization with Near-optimal Distortion Rate
A. Zandieh
Majid Daliri
Majid Hadian
Vahab Mirrokni
MQ
74
0
0
28 Apr 2025
HSR-Enhanced Sparse Attention Acceleration
Bo Chen
Yingyu Liang
Zhizhou Sha
Zhenmei Shi
Zhao-quan Song
95
18
0
14 Oct 2024
Hardness of Low Rank Approximation of Entrywise Transformed Matrix Products
Tamás Sarlós
Xingyou Song
David P. Woodruff
Qiuyi Zhang
34
3
0
03 Nov 2023
FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU
Ying Sheng
Lianmin Zheng
Binhang Yuan
Zhuohan Li
Max Ryabinin
...
Joseph E. Gonzalez
Percy Liang
Christopher Ré
Ion Stoica
Ce Zhang
149
369
0
13 Mar 2023