
Training-free LLM Merging for Multi-task Learning

Comments: 8 pages main text, 3 pages appendix, 3 pages bibliography; 7 figures, 11 tables
Abstract

Large Language Models (LLMs) have demonstrated exceptional capabilities across diverse natural language processing (NLP) tasks. The release of open-source LLMs like LLaMA and Qwen has spurred the development of numerous fine-tuned models tailored for various tasks and languages. In this paper, we explore an important question: can these specialized models be combined into a single unified model with multi-task capabilities? We introduce Hierarchical Iterative Merging (Hi-Merging), a training-free method for unifying different specialized LLMs into a single model. Specifically, Hi-Merging employs model-wise and layer-wise pruning and scaling, guided by contribution analysis, to mitigate parameter conflicts. Extensive experiments on multiple-choice and question-answering tasks in both Chinese and English validate the effectiveness of Hi-Merging for multi-task learning. The results demonstrate that Hi-Merging consistently outperforms existing merging techniques and, in most scenarios, surpasses models fine-tuned on the combined datasets. Code is available at: this https URL.
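The abstract describes the mechanism only at a high level. As a rough illustration of the kind of training-free operation involved, the sketch below prunes each specialized model's task vector (its delta from the shared base model) by magnitude, scales it, and adds the result back onto the base weights. The function name `merge_task_vectors`, the PyTorch state-dict representation, the top-k magnitude criterion, and the per-model `scales` are illustrative assumptions; they stand in for, and are not, Hi-Merging's actual contribution-guided hierarchical procedure.

```python
import torch

def merge_task_vectors(base_state, expert_states, keep_ratio=0.2, scales=None):
    """Illustrative sketch (not the paper's algorithm): prune each expert's
    task vector by magnitude, scale it, and add it onto the base weights."""
    scales = scales if scales is not None else [1.0] * len(expert_states)
    merged = {name: p.clone() for name, p in base_state.items()}
    for expert, scale in zip(expert_states, scales):
        for name, base_param in base_state.items():
            if not torch.is_floating_point(base_param):
                continue                                            # skip integer buffers
            delta = expert[name] - base_param                       # task vector for this tensor
            flat = delta.abs().flatten()
            k = max(1, int(keep_ratio * flat.numel()))
            threshold = flat.kthvalue(flat.numel() - k + 1).values  # k-th largest magnitude
            mask = delta.abs() >= threshold                         # keep only the top-k entries
            merged[name] += scale * delta * mask                    # scaled, pruned contribution
    return merged

# Example usage with two fine-tuned checkpoints sharing one base model:
# merged_state = merge_task_vectors(base.state_dict(),
#                                   [expert_a.state_dict(), expert_b.state_dict()],
#                                   keep_ratio=0.2, scales=[1.0, 0.8])
# base.load_state_dict(merged_state)
```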

@article{fu2025_2506.12379,
  title={Training-free LLM Merging for Multi-task Learning},
  author={Zichuan Fu and Xian Wu and Yejing Wang and Wanyu Wang and Shanshan Ye and Hongzhi Yin and Yi Chang and Yefeng Zheng and Xiangyu Zhao},
  journal={arXiv preprint arXiv:2506.12379},
  year={2025}
}