2411.02530
Cited By
A Comprehensive Study on Quantization Techniques for Large Language Models
30 October 2024
Jiedong Lang
Zhehao Guo
Shuyu Huang
MQ
Papers citing "A Comprehensive Study on Quantization Techniques for Large Language Models"
6 / 6 papers shown
Title
Intelligent Orchestration of Distributed Large Foundation Model Inference at the Edge
Fernando Koch
Aladin Djuhera
Alecio Binotto
74
0
0
01 Jul 2025
Resource-Efficient Language Models: Quantization for Fast and Accessible Inference
Tollef Emil Jørgensen
MQ
95
0
0
13 May 2025
ConTextual: Improving Clinical Text Summarization in LLMs with Context-preserving Token Filtering and Knowledge Graphs
Fahmida Liza Piya
Rahmatollah Beheshti
286
0
0
23 Apr 2025
Enhancing Ultra-Low-Bit Quantization of Large Language Models Through Saliency-Aware Partial Retraining
Deyu Cao
Samin Aref
MQ
80
0
0
14 Apr 2025
Quantization Error Propagation: Revisiting Layer-Wise Post-Training Quantization
Yamato Arai
Yuma Ichikawa
MQ
105
0
0
13 Apr 2025
Scaling Laws for Floating Point Quantization Training
Xingwu Sun
Shuaipeng Li
Ruobing Xie
Weidong Han
Kan Wu
...
Yangyu Tao
Zhanhui Kang
C. Xu
Di Wang
Jie Jiang
MQ
AIFin
128
2
0
05 Jan 2025