TuneComp: Joint Fine-tuning and Compression for Large Foundation Models

27 May 2025
Xiangyu Chen
Jing Liu
Ye Wang
Matthew Brand
Wang
Toshiaki Koike-Akino
arXiv (abs) · PDF · HTML
Main: 4 pages · 3 figures · 2 tables · Bibliography: 2 pages
Abstract

To reduce model size during post-training, compression methods such as knowledge distillation, low-rank approximation, and pruning are typically applied after the model has been fine-tuned. However, this sequential fine-tune-then-compress pipeline sacrifices performance and creates a larger-than-necessary model as an intermediate step. In this work, we aim to close this gap by directly constructing a smaller model guided by the downstream task. We propose to jointly fine-tune and compress the model by gradually distilling it into a pruned low-rank structure. Experiments demonstrate that joint fine-tuning and compression significantly outperforms sequential compression methods.
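
The abstract describes the method only at a high level: jointly fine-tune and compress by gradually distilling the model into a pruned low-rank structure. As a rough, non-authoritative sketch of that idea, the PyTorch snippet below factorizes a dense linear layer into low-rank factors, shrinks the rank over training, and combines a downstream task loss with a distillation loss from the original model. The names (LowRankLinear, prune_rank, joint_step), the SVD-based initialization, and the loss weighting are illustrative assumptions, not details taken from the paper.

# Illustrative sketch only: LowRankLinear, prune_rank, joint_step, the SVD
# initialization, and the loss weights are assumptions, not the paper's method.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LowRankLinear(nn.Module):
    """Linear layer whose weight is kept as a low-rank product U @ V."""

    def __init__(self, linear: nn.Linear, rank: int):
        super().__init__()
        # Initialize the factors from a truncated SVD of the dense weight.
        U, S, Vh = torch.linalg.svd(linear.weight.data, full_matrices=False)
        self.U = nn.Parameter(U[:, :rank] * S[:rank].sqrt())                # (out, r)
        self.V = nn.Parameter(S[:rank].sqrt().unsqueeze(1) * Vh[:rank, :])  # (r, in)
        self.bias = nn.Parameter(linear.bias.data.clone()) if linear.bias is not None else None

    def prune_rank(self, new_rank: int):
        # Gradually shrink the rank by keeping only the leading components.
        # (In practice, optimizer state for U and V must be rebuilt afterwards.)
        self.U = nn.Parameter(self.U.data[:, :new_rank].contiguous())
        self.V = nn.Parameter(self.V.data[:new_rank, :].contiguous())

    def forward(self, x):
        return F.linear(x, self.U @ self.V, self.bias)


def joint_step(student, teacher, batch, optimizer, alpha=0.5, temperature=2.0):
    """One training step combining downstream fine-tuning and distillation."""
    inputs, labels = batch
    student_logits = student(inputs)
    with torch.no_grad():
        teacher_logits = teacher(inputs)  # frozen original (uncompressed) model
    task_loss = F.cross_entropy(student_logits, labels)
    kd_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    loss = alpha * task_loss + (1.0 - alpha) * kd_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

A training loop along these lines would call joint_step on each batch and periodically call prune_rank with a decreasing rank schedule, so the model is compressed while it is being adapted to the downstream task rather than compressed afterwards.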

@article{chen2025_2505.21835,
  title={TuneComp: Joint Fine-tuning and Compression for Large Foundation Models},
  author={Xiangyu Chen and Jing Liu and Ye Wang and Matthew Brand and Wang and Toshiaki Koike-Akino},
  journal={arXiv preprint arXiv:2505.21835},
  year={2025}
}