2402.17985
Cited By
FlattenQuant: Breaking Through the Inference Compute-bound for Large Language Models with Per-tensor Quantization
28 February 2024
Yi Zhang
Fei Yang
Shuang Peng
Fangyu Wang
Aimin Pan
MQ
Papers citing "FlattenQuant: Breaking Through the Inference Compute-bound for Large Language Models with Per-tensor Quantization"
1 / 1 papers shown
Title
Mamba-PTQ: Outlier Channels in Recurrent Large Language Models
Alessandro Pierro
Steven Abreu
MQ
Mamba
43
6
0
17 Jul 2024