LCD: Advancing Extreme Low-Bit Clustering for Large Language Models via Knowledge Distillation

23 May 2025
Fangxin Liu, Ning Yang, Junping Zhao, Tao Yang, Haibing Guan, Li Jiang
Main text: 8 pages, 8 figures, 5 tables; bibliography: 2 pages
Abstract

Large language models (LLMs) have achieved significant progress in natural language processing but remain difficult to deploy due to their high memory and computational requirements. Weight quantization is a common way to reduce these costs, yet effective low-bit compression remains challenging. This paper presents LCD, which unifies the learning of clustering-based quantization within a knowledge distillation framework. Using carefully designed optimization techniques, LCD preserves LLM performance even at ultra-low bit widths of 2-3 bits. Additionally, LCD compresses activations through smoothing and accelerates inference with a LUT-based design. Experimental results show that LCD outperforms existing methods and delivers up to a 6.2x inference speedup, while also being more cost-effective than prior approaches, making it a practical solution for real-world applications.
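To make the abstract's ingredients concrete, below is a minimal sketch of clustering-based weight quantization trained under a distillation loss, together with a SmoothQuant-style activation-smoothing step and a LUT-style dequantization lookup. Everything here is an illustrative assumption (the function names, kd_weight, alpha, the k-means initialization), not the authors' LCD implementation; the paper's actual clustering objective, smoothing scheme, and fused LUT kernels will differ.

import torch
import torch.nn.functional as F

def kmeans_codebook(weights, num_clusters=4, iters=25):
    # Cluster flattened weights into a tiny codebook; 4 entries ~ a 2-bit width.
    flat = weights.reshape(-1, 1)
    centroids = flat[torch.randperm(flat.size(0))[:num_clusters]].clone()
    for _ in range(iters):
        assign = torch.cdist(flat, centroids).argmin(dim=1)  # nearest centroid
        for k in range(num_clusters):
            mask = assign == k
            if mask.any():  # keep the old centroid if the cluster is empty
                centroids[k] = flat[mask].mean()
    return centroids.squeeze(1), assign.reshape(weights.shape)

def dequantize(codebook, indices):
    # LUT-style reconstruction: each stored integer index is a table lookup.
    return codebook[indices]

def smooth(activations, weight, alpha=0.5):
    # SmoothQuant-style per-channel scaling that migrates activation outliers
    # into the weights so that activations compress better.
    # activations: (tokens, in_features); weight: (in_features, out_features)
    act_max = activations.abs().amax(dim=0).clamp(min=1e-5)
    w_max = weight.abs().amax(dim=1).clamp(min=1e-5)
    s = (act_max ** alpha) / (w_max ** (1.0 - alpha))
    return activations / s, weight * s.unsqueeze(1)  # (X/s) @ (sW) == X @ W

def kd_step(student_logits, teacher_logits, labels, kd_weight=0.5, T=2.0):
    # Distillation objective: the full-precision teacher supervises the
    # clustered-weight student, blended with ordinary cross-entropy.
    ce = F.cross_entropy(student_logits, labels)
    kd = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    return (1.0 - kd_weight) * ce + kd_weight * kd

In a training loop, the codebook centroids would be the learnable parameters updated by the gradient of kd_step while the integer index map stays fixed; at inference time, dequantize is the table lookup that a fused LUT kernel would perform inside the matrix multiplication.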

View on arXiv: https://arxiv.org/abs/2506.12038
@article{liu2025_2506.12038,
  title={LCD: Advancing Extreme Low-Bit Clustering for Large Language Models via Knowledge Distillation},
  author={Fangxin Liu and Ning Yang and Junping Zhao and Tao Yang and Haibing Guan and Li Jiang},
  journal={arXiv preprint arXiv:2506.12038},
  year={2025}
}