Optimal Brain Iterative Merging: Mitigating Interference in LLM Merging

17 February 2025

Abstract

Large Language Models (LLMs) have demonstrated impressive capabilities, but their high computational costs pose challenges for customization. Model merging offers a cost-effective alternative, yet existing methods suffer from interference among parameters, leading to performance degradation. In this work, we propose Optimal Brain Iterative Merging (OBIM), a novel method designed to mitigate both intra-model and inter-model interference. OBIM consists of two key components: (1) a saliency measurement mechanism that evaluates parameter importance by the loss change induced by altering individual weights, which reduces intra-model interference by preserving only high-saliency parameters; and (2) a mutually exclusive iterative merging framework that incrementally integrates models using a binary mask instead of direct parameter averaging, which mitigates inter-model interference. We validate OBIM through experiments on both Supervised Fine-Tuned (SFT) models and post-pretrained checkpoints. The results show that OBIM significantly outperforms existing merging techniques. Overall, OBIM provides an effective and practical solution for enhancing LLM merging.
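
The two components described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not give the saliency formula or the merge order, so the sketch assumes an Optimal-Brain-Damage-style second-order estimate (0.5 * h_i * delta_i^2, with a hypothetical per-parameter curvature h_i supplied by the caller), a hypothetical keep_ratio for selecting high-saliency parameters, and a first-come rule in which a position claimed by an earlier model is never overwritten. All function names are illustrative.

```python
from typing import List, Sequence


def saliency(delta: Sequence[float], curvature: Sequence[float]) -> List[float]:
    """Assumed saliency: second-order estimate of the loss change caused by
    altering one weight, 0.5 * h_i * delta_i**2 (diagonal approximation)."""
    return [0.5 * h * d * d for d, h in zip(delta, curvature)]


def top_k_mask(scores: Sequence[float], keep_ratio: float) -> List[bool]:
    """Binary mask that keeps the highest-scoring fraction of positions."""
    k = max(1, round(keep_ratio * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = set(ranked[:k])
    return [i in keep for i in range(len(scores))]


def merge_exclusive(
    base: Sequence[float],
    deltas: Sequence[Sequence[float]],
    curvatures: Sequence[Sequence[float]],
    keep_ratio: float = 0.5,
) -> List[float]:
    """Merge fine-tuned deltas into the base one model at a time.

    Each position is written by at most one model (no averaging): a model
    may only fill positions that no earlier model has claimed.
    """
    merged = list(base)
    claimed = [False] * len(base)
    for delta, curv in zip(deltas, curvatures):
        mask = top_k_mask(saliency(delta, curv), keep_ratio)
        for i, keep in enumerate(mask):
            if keep and not claimed[i]:
                merged[i] = base[i] + delta[i]
                claimed[i] = True
    return merged


# Toy example: four parameters, two fine-tuned models, unit curvature.
base = [0.0, 0.0, 0.0, 0.0]
delta_a = [1.0, 0.1, 0.0, 0.5]   # model A's most salient positions: 0 and 3
delta_b = [2.0, 0.0, 3.0, 0.1]   # model B's most salient positions: 2 and 0
ones = [1.0] * 4

merged = merge_exclusive(base, [delta_a, delta_b], [ones, ones])
print(merged)  # [1.0, 0.0, 3.0, 0.5]
```

In this toy run both models rank position 0 as salient, but model A is merged first and claims it, so model B contributes only position 2. The result depends on the merge order, which is a property of this first-come assumption rather than something stated in the abstract.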

@article{wang2025_2502.12217,
  title={Optimal Brain Iterative Merging: Mitigating Interference in LLM Merging},
  author={Zhixiang Wang and Zhenyu Mao and Yixuan Qiao and Yunfang Wu and Biye Li},
  journal={arXiv preprint arXiv:2502.12217},
  year={2025}
}
