SubGen: Token Generation in Sublinear Time and Memory

8 February 2024
A. Zandieh
Insu Han
Vahab Mirrokni
Amin Karbasi
arXiv: 2402.06082

Papers citing "SubGen: Token Generation in Sublinear Time and Memory"

4 / 4 papers shown
TurboQuant: Online Vector Quantization with Near-optimal Distortion Rate
A. Zandieh
Majid Daliri
Majid Hadian
Vahab Mirrokni
MQ
74
0
0
28 Apr 2025
HSR-Enhanced Sparse Attention Acceleration
Bo Chen
Yingyu Liang
Zhizhou Sha
Zhenmei Shi
Zhao-quan Song
95
18
0
14 Oct 2024
Hardness of Low Rank Approximation of Entrywise Transformed Matrix Products
Tamás Sarlós
Xingyou Song
David P. Woodruff
Qiuyi Zhang
34
3
0
03 Nov 2023
FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU
Ying Sheng
Lianmin Zheng
Binhang Yuan
Zhuohan Li
Max Ryabinin
...
Joseph E. Gonzalez
Percy Liang
Christopher Ré
Ion Stoica
Ce Zhang
149
369
0
13 Mar 2023