Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2406.17415
Cited By
Layer-Wise Quantization: A Pragmatic and Effective Method for Quantizing LLMs Beyond Integer Bit-Levels
25 June 2024
Razvan-Gabriel Dumitru
Vikas Yadav
Rishabh Maheshwary
Paul-Ioan Clotan
Sathwik Tejaswi Madhusudhan
Mihai Surdeanu
MQ
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Layer-Wise Quantization: A Pragmatic and Effective Method for Quantizing LLMs Beyond Integer Bit-Levels"
3 / 3 papers shown
Title
The Curse of Depth in Large Language Models
Wenfang Sun
Xinyuan Song
Pengxiang Li
Lu Yin
Yefeng Zheng
Shiwei Liu
75
4
0
09 Feb 2025
LSAQ: Layer-Specific Adaptive Quantization for Large Language Model Deployment
Binrui Zeng
Bin Ji
Xiaodong Liu
Jie Yu
Shasha Li
Jun Ma
Xiaopeng Li
Shangwen Wang
Xinran Hong
Yongtao Tang
MQ
42
1
0
24 Dec 2024
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
Sheng Shen
Zhen Dong
Jiayu Ye
Linjian Ma
Z. Yao
A. Gholami
Michael W. Mahoney
Kurt Keutzer
MQ
233
576
0
12 Sep 2019
1