Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2411.17525
Cited By
Pushing the Limits of Large Language Model Quantization via the Linearity Theorem
26 November 2024
Vladimir Malinovskii
Andrei Panferov
Ivan Ilin
Han Guo
Peter Richtárik
Dan Alistarh
MQ
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Pushing the Limits of Large Language Model Quantization via the Linearity Theorem"
6 / 6 papers shown
Title
Addition is almost all you need: Compressing neural networks with double binary factorization
Vladimír Boža
Vladimír Macko
MQ
17
0
0
16 May 2025
GuidedQuant: Large Language Model Quantization via Exploiting End Loss Guidance
Jinuk Kim
Marwa El Halabi
W. Park
Clemens JS Schaefer
Deokjae Lee
Yeonhong Park
Jae W. Lee
Hyun Oh Song
MQ
34
0
0
11 May 2025
Towards Quantifying the Hessian Structure of Neural Networks
Zhaorui Dong
Yushun Zhang
Zhi-Quan Luo
Jianfeng Yao
Ruoyu Sun
31
0
0
05 May 2025
Hessian of Perplexity for Large Language Models by PyTorch autograd (Open Source)
Ivan Ilin
26
0
0
06 Apr 2025
SQuat: Subspace-orthogonal KV Cache Quantization
Hao Wang
Ligong Han
Kai Xu
Akash Srivastava
MQ
51
0
0
31 Mar 2025
CE-LoRA: Computation-Efficient LoRA Fine-Tuning for Language Models
Guanduo Chen
Yutong He
Yipeng Hu
Kun Yuan
Binhang Yuan
54
0
0
03 Feb 2025
1