ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2502.12217
  4. Cited By
Optimal Brain Iterative Merging: Mitigating Interference in LLM Merging

Optimal Brain Iterative Merging: Mitigating Interference in LLM Merging

17 February 2025
Zhixiang Wang
Zhenyu Mao
Yixuan Qiao
Hao Sun
Biye Li
    MoMe
ArXivPDFHTML

Papers citing "Optimal Brain Iterative Merging: Mitigating Interference in LLM Merging"

25 / 25 papers shown
Title
Evolutionary Optimization of Model Merging Recipes
Evolutionary Optimization of Model Merging Recipes
Takuya Akiba
Makoto Shing
Yujin Tang
Qi Sun
David Ha
MoMe
276
121
0
28 Jan 2025
Beyond Task Vectors: Selective Task Arithmetic Based on Importance
  Metrics
Beyond Task Vectors: Selective Task Arithmetic Based on Importance Metrics
Tian Bowen
Lai Songning
Wu Jiemin
Shuai Zhihao
Ge Shiming
Yue Yutao
MoMe
108
5
0
25 Nov 2024
FuseChat: Knowledge Fusion of Chat Models
FuseChat: Knowledge Fusion of Chat Models
Fanqi Wan
Longguang Zhong
Ziyi Yang
Ruijun Chen
Xiaojun Quan
ALM
KELM
MoMe
81
25
0
15 Aug 2024
DogeRM: Equipping Reward Models with Domain Knowledge through Model
  Merging
DogeRM: Equipping Reward Models with Domain Knowledge through Model Merging
Tzu-Han Lin
Chen-An Li
Hung-yi Lee
Yun-Nung Chen
VLM
ALM
42
5
0
01 Jul 2024
DELLA-Merging: Reducing Interference in Model Merging through
  Magnitude-Based Sampling
DELLA-Merging: Reducing Interference in Model Merging through Magnitude-Based Sampling
Pala Tej Deep
Rishabh Bhardwaj
Soujanya Poria
MoMe
73
28
0
17 Jun 2024
Localizing Task Information for Improved Model Merging and Compression
Localizing Task Information for Improved Model Merging and Compression
Ke Wang
Nikolaos Dimitriadis
Guillermo Ortiz-Jimenez
Franccois Fleuret
Pascal Frossard
MoMe
67
57
0
13 May 2024
Model Stock: All we need is just a few fine-tuned models
Model Stock: All we need is just a few fine-tuned models
Dong-Hwan Jang
Sangdoo Yun
Dongyoon Han
OODD
MoMe
75
42
0
28 Mar 2024
Checkpoint Merging via Bayesian Optimization in LLM Pretraining
Checkpoint Merging via Bayesian Optimization in LLM Pretraining
Deyuan Liu
Zecheng Wang
Bingning Wang
Weipeng Chen
Chunshan Li
Zhiying Tu
Dianhui Chu
Bo Li
Dianbo Sui
MoMe
77
17
0
28 Mar 2024
Model Breadcrumbs: Scaling Multi-Task Model Merging with Sparse Masks
Model Breadcrumbs: Scaling Multi-Task Model Merging with Sparse Masks
Mohammad-Javad Davari
Eugene Belilovsky
MoMe
78
64
0
11 Dec 2023
Controlled Text Generation via Language Model Arithmetic
Controlled Text Generation via Language Model Arithmetic
Jasper Dekoninck
Marc Fischer
Luca Beurer-Kellner
Martin Vechev
65
40
0
24 Nov 2023
Language Models are Super Mario: Absorbing Abilities from Homologous
  Models as a Free Lunch
Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch
Le Yu
Yu Bowen
Haiyang Yu
Fei Huang
Yongbin Li
MoMe
104
320
0
06 Nov 2023
WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct
WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct
Haipeng Luo
Qingfeng Sun
Can Xu
Pu Zhao
Jian-Guang Lou
...
Xiubo Geng
Qingwei Lin
Shifeng Chen
Yansong Tang
Dongmei Zhang
LRM
OSLM
203
455
0
18 Aug 2023
Separate the Wheat from the Chaff: Model Deficiency Unlearning via
  Parameter-Efficient Module Operation
Separate the Wheat from the Chaff: Model Deficiency Unlearning via Parameter-Efficient Module Operation
Xinshuo Hu
Dongfang Li
Baotian Hu
Zihao Zheng
Zhenyu Liu
Hao Fei
KELM
MU
85
30
0
16 Aug 2023
Llama 2: Open Foundation and Fine-Tuned Chat Models
Llama 2: Open Foundation and Fine-Tuned Chat Models
Hugo Touvron
Louis Martin
Kevin R. Stone
Peter Albert
Amjad Almahairi
...
Sharan Narang
Aurelien Rodriguez
Robert Stojnic
Sergey Edunov
Thomas Scialom
AI4MH
ALM
299
11,894
0
18 Jul 2023
C-Eval: A Multi-Level Multi-Discipline Chinese Evaluation Suite for
  Foundation Models
C-Eval: A Multi-Level Multi-Discipline Chinese Evaluation Suite for Foundation Models
Yuzhen Huang
Yuzhuo Bai
Zhihao Zhu
Junlei Zhang
Jinghan Zhang
...
Yikai Zhang
Jiayi Lei
Yao Fu
Maosong Sun
Junxian He
ELM
LRM
71
539
0
15 May 2023
Robust Weight Signatures: Gaining Robustness as Easy as Patching
  Weights?
Robust Weight Signatures: Gaining Robustness as Easy as Patching Weights?
Ruisi Cai
Zhenyu Zhang
Zhangyang Wang
AAML
OOD
71
12
0
24 Feb 2023
Dataless Knowledge Fusion by Merging Weights of Language Models
Dataless Knowledge Fusion by Merging Weights of Language Models
Xisen Jin
Xiang Ren
Daniel Preoţiuc-Pietro
Pengxiang Cheng
FedML
MoMe
73
241
0
19 Dec 2022
GPTQ: Accurate Post-Training Quantization for Generative Pre-trained
  Transformers
GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers
Elias Frantar
Saleh Ashkboos
Torsten Hoefler
Dan Alistarh
MQ
124
989
0
31 Oct 2022
Optimal Brain Compression: A Framework for Accurate Post-Training
  Quantization and Pruning
Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning
Elias Frantar
Sidak Pal Singh
Dan Alistarh
MQ
98
237
0
24 Aug 2022
Model soups: averaging weights of multiple fine-tuned models improves
  accuracy without increasing inference time
Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time
Mitchell Wortsman
Gabriel Ilharco
S. Gadre
Rebecca Roelofs
Raphael Gontijo-Lopes
...
Hongseok Namkoong
Ali Farhadi
Y. Carmon
Simon Kornblith
Ludwig Schmidt
MoMe
146
989
1
10 Mar 2022
Merging Models with Fisher-Weighted Averaging
Merging Models with Fisher-Weighted Averaging
Michael Matena
Colin Raffel
FedML
MoMe
87
393
0
18 Nov 2021
Training Verifiers to Solve Math Word Problems
Training Verifiers to Solve Math Word Problems
K. Cobbe
V. Kosaraju
Mohammad Bavarian
Mark Chen
Heewoo Jun
...
Jerry Tworek
Jacob Hilton
Reiichiro Nakano
Christopher Hesse
John Schulman
ReLM
OffRL
LRM
285
4,408
0
27 Oct 2021
Program Synthesis with Large Language Models
Program Synthesis with Large Language Models
Jacob Austin
Augustus Odena
Maxwell Nye
Maarten Bosma
Henryk Michalewski
...
Ellen Jiang
Carrie J. Cai
Michael Terry
Quoc V. Le
Charles Sutton
ELM
AIMat
ReCod
ALM
195
1,954
0
16 Aug 2021
Evaluating Large Language Models Trained on Code
Evaluating Large Language Models Trained on Code
Mark Chen
Jerry Tworek
Heewoo Jun
Qiming Yuan
Henrique Pondé
...
Bob McGrew
Dario Amodei
Sam McCandlish
Ilya Sutskever
Wojciech Zaremba
ELM
ALM
229
5,539
0
07 Jul 2021
Measuring Mathematical Problem Solving With the MATH Dataset
Measuring Mathematical Problem Solving With the MATH Dataset
Dan Hendrycks
Collin Burns
Saurav Kadavath
Akul Arora
Steven Basart
Eric Tang
D. Song
Jacob Steinhardt
ReLM
FaML
173
2,265
0
05 Mar 2021
1