NF4 Isn't Information Theoretically Optimal (and that's Good)
Davis Yoshida · 12 June 2023 · arXiv:2306.06965 · MQ
Papers citing "NF4 Isn't Information Theoretically Optimal (and that's Good)" (7 of 7 shown)
Improving Block-Wise LLM Quantization by 4-bit Block-Wise Optimal Float (BOF4): Analysis and Variations
Patrick Blumenberg, Thomas Graave, Tim Fingscheidt
MQ · 24 · 0 · 0 · 10 May 2025

70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float
Tianyi Zhang, Yang Sui, Shaochen Zhong, V. Chaudhary, Xia Hu, Anshumali Shrivastava
MQ · 32 · 0 · 0 · 15 Apr 2025

Scaling Laws for Floating Point Quantization Training
Xingchen Sun, Shuaipeng Li, Ruobing Xie, Weidong Han, Kan Wu, ..., Yangyu Tao, Zhanhui Kang, C. Xu, Di Wang, Jie Jiang
MQ, AIFin · 62 · 0 · 0 · 05 Jan 2025

Pushing the Limits of Large Language Model Quantization via the Linearity Theorem
Vladimir Malinovskii, Andrei Panferov, Ivan Ilin, Han Guo, Peter Richtárik, Dan Alistarh
MQ · 80 · 7 · 0 · 26 Nov 2024

DAQ: Density-Aware Post-Training Weight-Only Quantization For LLMs
Yingsong Luo, Ling Chen
MQ · 23 · 0 · 0 · 16 Oct 2024

LQ-LoRA: Low-rank Plus Quantized Matrix Decomposition for Efficient Language Model Finetuning
Han Guo, P. Greengard, Eric P. Xing, Yoon Kim
MQ · 36 · 43 · 0 · 20 Nov 2023

The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao, Stella Biderman, Sid Black, Laurence Golding, Travis Hoppe, ..., Horace He, Anish Thite, Noa Nabeshima, Shawn Presser, Connor Leahy
AIMat · 282 · 1,996 · 0 · 31 Dec 2020