ResearchTrend.AI

arXiv:2411.17525
Pushing the Limits of Large Language Model Quantization via the Linearity Theorem
26 November 2024
Authors: Vladimir Malinovskii, Andrei Panferov, Ivan Ilin, Han Guo, Peter Richtárik, Dan Alistarh
Tag: MQ
Links: arXiv · PDF · HTML

Papers citing "Pushing the Limits of Large Language Model Quantization via the Linearity Theorem" (6 of 6 shown):
1. Addition is almost all you need: Compressing neural networks with double binary factorization (MQ)
   Vladimír Boža, Vladimír Macko · 16 May 2025
2. GuidedQuant: Large Language Model Quantization via Exploiting End Loss Guidance (MQ)
   Jinuk Kim, Marwa El Halabi, W. Park, Clemens JS Schaefer, Deokjae Lee, Yeonhong Park, Jae W. Lee, Hyun Oh Song · 11 May 2025
3. Towards Quantifying the Hessian Structure of Neural Networks
   Zhaorui Dong, Yushun Zhang, Zhi-Quan Luo, Jianfeng Yao, Ruoyu Sun · 05 May 2025
4. Hessian of Perplexity for Large Language Models by PyTorch autograd (Open Source)
   Ivan Ilin · 06 Apr 2025
5. SQuat: Subspace-orthogonal KV Cache Quantization (MQ)
   Hao Wang, Ligong Han, Kai Xu, Akash Srivastava · 31 Mar 2025
6. CE-LoRA: Computation-Efficient LoRA Fine-Tuning for Language Models
   Guanduo Chen, Yutong He, Yipeng Hu, Kun Yuan, Binhang Yuan · 03 Feb 2025