Pruning via Merging: Compressing LLMs via Manifold Alignment Based Layer Merging

24 June 2024
Deyuan Liu
Zhanyue Qin
Hairu Wang
Zhao Yang
Zecheng Wang
Fangying Rong
Qingbin Liu
Yanchao Hao
Xi Chen
Cunhang Fan
Zhao Lv
Zhiying Tu
Dianhui Chu
Bo Li
Dianbo Sui
Main: 10 pages, 7 figures; Appendix: 3 pages
Abstract

While large language models (LLMs) excel in many domains, their complexity and scale challenge deployment in resource-limited environments. Current compression techniques, such as parameter pruning, often fail to effectively utilize the knowledge from pruned parameters. To address these challenges, we propose Manifold-Based Knowledge Alignment and Layer Merging Compression (MKA), a novel approach that uses manifold learning and the Normalized Pairwise Information Bottleneck (NPIB) measure to merge similar layers, reducing model size while preserving essential performance. We evaluate MKA on multiple benchmark datasets and various LLMs. Our findings show that MKA not only preserves model performance but also achieves substantial compression ratios, outperforming traditional pruning methods. Moreover, when coupled with quantization, MKA delivers even greater compression. Specifically, on the MMLU dataset using the Llama3-8B model, MKA achieves a compression ratio of 43.75% with a minimal performance decrease of only 2.82%. The proposed MKA method offers a resource-efficient and performance-preserving model compression technique for LLMs.
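As a rough illustration of the layer-merging idea described in the abstract, the Python sketch below scores adjacent transformer layers by a similarity measure computed from calibration-set activations and averages the weights of the most similar pair. The function names, the scalar pooling of activations, and the histogram-based mutual-information proxy are assumptions chosen for brevity; they stand in for, and are much simpler than, the paper's manifold-learning step and NPIB measure.

# Illustrative sketch, not the authors' MKA implementation: estimate how similar
# two layers are from their output activations on a small calibration set, then
# merge the most similar adjacent pair by averaging weights.
import numpy as np

def layer_similarity(acts_a: np.ndarray, acts_b: np.ndarray, bins: int = 32) -> float:
    """Crude normalized mutual information between two layers' pooled activations.

    acts_* : (num_tokens, hidden_dim) activation matrices collected on the same
    calibration tokens. This scalar-pooled, histogram-based score is a simple
    stand-in for the NPIB measure used in the paper.
    """
    # Pool each token's activation vector to a scalar so it can be histogrammed.
    a = acts_a.mean(axis=1)
    b = acts_b.mean(axis=1)
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    pxy = joint / joint.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    nz = pxy > 0
    mi = np.sum(pxy[nz] * np.log(pxy[nz] / (px[:, None] * py[None, :])[nz]))
    hx = -np.sum(px[px > 0] * np.log(px[px > 0]))
    hy = -np.sum(py[py > 0] * np.log(py[py > 0]))
    return mi / max(np.sqrt(hx * hy), 1e-12)  # normalize to roughly [0, 1]

def merge_most_similar_pair(weights: list, activations: list) -> list:
    """One merge step: average the parameters of the most similar adjacent layers.

    weights     : list of dicts mapping parameter name -> np.ndarray, one per layer
                  (all layers assumed to share the same parameter names and shapes).
    activations : list of (num_tokens, hidden_dim) arrays, aligned with `weights`.
    """
    scores = [layer_similarity(activations[i], activations[i + 1])
              for i in range(len(activations) - 1)]
    i = int(np.argmax(scores))
    merged = {k: 0.5 * (weights[i][k] + weights[i + 1][k]) for k in weights[i]}
    return weights[:i] + [merged] + weights[i + 2:]

Repeating the merge step shrinks the layer stack one layer at a time; the paper's approach additionally aligns layer representations on a learned manifold before deciding which layers to fuse.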
@article{liu2025_2406.16330,
  title={Pruning via Merging: Compressing LLMs via Manifold Alignment Based Layer Merging},
  author={Deyuan Liu and Zhanyue Qin and Hairu Wang and Zhao Yang and Zecheng Wang and Fangying Rong and Qingbin Liu and Yanchao Hao and Xi Chen and Cunhang Fan and Zhao Lv and Zhiying Tu and Dianhui Chu and Bo Li and Dianbo Sui},
  journal={arXiv preprint arXiv:2406.16330},
  year={2025}
}