Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2502.12706
Cited By
Scalable Model Merging with Progressive Layer-wise Distillation
18 February 2025
Jing Xu
Jiazheng Li
J.N. Zhang
MoMe
FedML
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Scalable Model Merging with Progressive Layer-wise Distillation"
50 / 89 papers shown
Title
Composable Cross-prompt Essay Scoring by Merging Models
Sanwoo Lee
Kun Liang
Yunfang Wu
MoMe
31
0
0
24 May 2025
Leveraging Submodule Linearity Enhances Task Arithmetic Performance in LLMs
Rui Dai
Sile Hu
Xu Shen
Yonggang Zhang
Xinmei Tian
Jieping Ye
MoMe
81
3
0
15 Apr 2025
1bit-Merging: Dynamic Quantized Merging for Large Language Models
Shuqi Liu
Yuxuan Yao
Bowei He
Zehua Liu
Xiongwei Han
Mingxuan Yuan
Han Wu
Linqi Song
MoMe
MQ
111
2
0
15 Feb 2025
No Task Left Behind: Isotropic Model Merging with Common and Task-Specific Subspaces
Daniel Marczak
Simone Magistri
Sebastian Cygert
Bartłomiej Twardowski
Andrew D. Bagdanov
Joost van de Weijer
MoMe
77
4
0
07 Feb 2025
Merging Models on the Fly Without Retraining: A Sequential Approach to Scalable Continual Model Merging
Anke Tang
Enneng Yang
Li Shen
Yong Luo
Han Hu
Bo Du
Dacheng Tao
MoMe
CLL
70
5
0
17 Jan 2025
Multi-Task Model Merging via Adaptive Weight Disentanglement
Feng Xiong
Runxi Cheng
Wang Chen
Zhanqiu Zhang
Yiwen Guo
Chun Yuan
Ruifeng Xu
MoMe
166
6
0
10 Jan 2025
Localize-and-Stitch: Efficient Model Merging via Sparse Task Arithmetic
Yifei He
Yuzheng Hu
Yong Lin
Tong Zhang
Han Zhao
FedML
MoMe
93
24
0
08 Jan 2025
Parameter-Efficient Interventions for Enhanced Model Merging
Marcin Osial
Daniel Marczak
Bartosz Zieliñski
MoMe
121
1
0
22 Dec 2024
Channel Merging: Preserving Specialization for Merged Experts
Mingyang Zhang
Qingbin Liu
Ganggui Ding
Xinyi Yu
L. Ou
Bohan Zhuang
CLL
MoMe
112
3
0
18 Dec 2024
Task Singular Vectors: Reducing Task Interference in Model Merging
Antonio Andrea Gargiulo
Donato Crisostomi
Maria Sofia Bucarelli
Simone Scardapane
Fabrizio Silvestri
Emanuele Rodolà
MoMe
129
14
0
26 Nov 2024
Beyond Task Vectors: Selective Task Arithmetic Based on Importance Metrics
Tian Bowen
Lai Songning
Wu Jiemin
Shuai Zhihao
Ge Shiming
Yue Yutao
MoMe
95
4
0
25 Nov 2024
FREE-Merging: Fourier Transform for Efficient Model Merging
Shenghe Zheng
Hongzhi Wang
MoMe
105
1
0
25 Nov 2024
Model merging with SVD to tie the Knots
George Stoica
Pratik Ramesh
B. Ecsedi
Leshem Choshen
Judy Hoffman
MoMe
74
16
0
25 Oct 2024
SurgeryV2: Bridging the Gap Between Model Merging and Multi-Task Learning with Deep Representation Surgery
Enneng Yang
Li Shen
Zhenyi Wang
G. Guo
Xingwei Wang
Xiaocun Cao
Jie Zhang
Dacheng Tao
MoMe
68
6
0
18 Oct 2024
DARE the Extreme: Revisiting Delta-Parameter Pruning For Fine-Tuned Models
Wenlong Deng
Yize Zhao
V. Vakilian
Minghui Chen
Xiaoxiao Li
Christos Thrampoulidis
148
6
0
12 Oct 2024
Merging in a Bottle: Differentiable Adaptive Merging (DAM) and the Path from Averaging to Automation
Thomas Gauthier-Caron
Shamane Siriwardhana
Elliot Stein
Malikeh Ehghaghi
Charles Goddard
Mark McQuade
Jacob Solawetz
Maxime Labonne
MoMe
54
2
0
10 Oct 2024
FuseChat: Knowledge Fusion of Chat Models
Fanqi Wan
Longguang Zhong
Ziyi Yang
Ruijun Chen
Xiaojun Quan
ALM
KELM
MoMe
64
25
0
15 Aug 2024
Extend Model Merging from Fine-Tuned to Pre-Trained Large Language Models via Weight Disentanglement
Le Yu
Bowen Yu
Haiyang Yu
Fei Huang
Yongbin Li
MoMe
65
6
0
06 Aug 2024
Harmony in Diversity: Merging Neural Networks with Canonical Correlation Analysis
Stefan Horoi
Albert Manuel Orozco Camacho
Eugene Belilovsky
Guy Wolf
FedML
MoMe
47
10
0
07 Jul 2024
Knowledge Composition using Task Vectors with Learned Anisotropic Scaling
Frederic Z. Zhang
Paul Albert
Cristian Rodriguez-Opazo
Anton van den Hengel
Ehsan Abbasnejad
MoMe
96
13
0
03 Jul 2024
PLeaS -- Merging Models with Permutations and Least Squares
Anshul Nasery
J. Hayase
Pang Wei Koh
Sewoong Oh
MoMe
79
3
0
02 Jul 2024
MetaGPT: Merging Large Language Models Using Model Exclusive Task Arithmetic
Yuyan Zhou
Liang Song
Bingning Wang
Weipeng Chen
MoMe
70
23
0
17 Jun 2024
Twin-Merging: Dynamic Integration of Modular Expertise in Model Merging
Zhenyi Lu
Chenghao Fan
Wei Wei
Xiaoye Qu
Dangyang Chen
Yu Cheng
MoMe
80
57
0
17 Jun 2024
FusionBench: A Comprehensive Benchmark of Deep Model Fusion
Anke Tang
Li Shen
Yong Luo
Han Hu
Di Lin
Dacheng Tao
ELM
MoMe
VLM
61
25
0
05 Jun 2024
EMR-Merging: Tuning-Free High-Performance Model Merging
Chenyu Huang
Peng Ye
Tao Chen
Tong He
Xiangyu Yue
Wanli Ouyang
MoMe
71
40
0
23 May 2024
Localizing Task Information for Improved Model Merging and Compression
Ke Wang
Nikolaos Dimitriadis
Guillermo Ortiz-Jimenez
Franccois Fleuret
Pascal Frossard
MoMe
55
56
0
13 May 2024
Length-Controlled AlpacaEval: A Simple Way to Debias Automatic Evaluators
Yann Dubois
Balázs Galambosi
Percy Liang
Tatsunori Hashimoto
ALM
81
359
0
06 Apr 2024
Checkpoint Merging via Bayesian Optimization in LLM Pretraining
Deyuan Liu
Zecheng Wang
Bingning Wang
Weipeng Chen
Chunshan Li
Zhiying Tu
Dianhui Chu
Bo Li
Dianbo Sui
MoMe
67
17
0
28 Mar 2024
Arcee's MergeKit: A Toolkit for Merging Large Language Models
Charles Goddard
Shamane Siriwardhana
Malikeh Ehghaghi
Luke Meyers
Vladimir Karpukhin
Brian Benedict
Mark McQuade
Jacob Solawetz
MoMe
KELM
122
93
0
20 Mar 2024
Training-Free Pretrained Model Merging
Zhenxing Xu
Ke Yuan
Huiqiong Wang
Yong Wang
Mingli Song
Mingli Song
MoMe
61
16
0
04 Mar 2024
Align-to-Distill: Trainable Attention Alignment for Knowledge Distillation in Neural Machine Translation
Heegon Jin
Seonil Son
Jemin Park
Youngseok Kim
Hyungjong Noh
Yeonsoo Lee
59
2
0
03 Mar 2024
Model Tailor: Mitigating Catastrophic Forgetting in Multi-modal Large Language Models
Didi Zhu
Zhongyi Sun
Zexi Li
Tao Shen
Ke Yan
Shouhong Ding
Kun Kuang
Chao Wu
CLL
KELM
MoMe
87
27
0
19 Feb 2024
End-to-End Training Induces Information Bottleneck through Layer-Role Differentiation: A Comparative Analysis with Layer-wise Training
Keitaro Sakamoto
Issei Sato
42
4
0
14 Feb 2024
Representation Surgery for Multi-Task Model Merging
Enneng Yang
Li Shen
Zhenyi Wang
Guibing Guo
Xiaojun Chen
Xingwei Wang
Dacheng Tao
MoMe
76
44
0
05 Feb 2024
Knowledge Fusion of Large Language Models
Fanqi Wan
Xinting Huang
Deng Cai
Xiaojun Quan
Wei Bi
Shuming Shi
MoMe
79
69
0
19 Jan 2024
Model Breadcrumbs: Scaling Multi-Task Model Merging with Sparse Masks
Mohammad-Javad Davari
Eugene Belilovsky
MoMe
67
63
0
11 Dec 2023
Concrete Subspace Learning based Interference Elimination for Multi-task Model Fusion
Anke Tang
Li Shen
Yong Luo
Liang Ding
Han Hu
Bo Du
Dacheng Tao
MoMe
66
22
0
11 Dec 2023
Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch
Le Yu
Yu Bowen
Haiyang Yu
Fei Huang
Yongbin Li
MoMe
79
304
0
06 Nov 2023
Equivariant Deep Weight Space Alignment
Aviv Navon
Aviv Shamsian
Ethan Fetaya
Gal Chechik
Nadav Dym
Haggai Maron
50
22
0
20 Oct 2023
Parameter Efficient Multi-task Model Fusion with Partial Linearization
Anke Tang
Li Shen
Yong Luo
Yibing Zhan
Han Hu
Bo Du
Yixin Chen
Dacheng Tao
MoMe
56
35
0
07 Oct 2023
AdaMerging: Adaptive Model Merging for Multi-Task Learning
Enneng Yang
Zhenyi Wang
Li Shen
Shiwei Liu
Guibing Guo
Xingwei Wang
Dacheng Tao
MoMe
51
111
0
04 Oct 2023
Module-wise Training of Neural Networks via the Minimizing Movement Scheme
Skander Karkar
Bhaskar Sen
Emmanuel de Bezenac
Patrick Gallinari
53
2
0
29 Sep 2023
WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct
Haipeng Luo
Qingfeng Sun
Can Xu
Pu Zhao
Jian-Guang Lou
...
Xiubo Geng
Qingwei Lin
Shifeng Chen
Yansong Tang
Dongmei Zhang
LRM
OSLM
160
439
0
18 Aug 2023
Llama 2: Open Foundation and Fine-Tuned Chat Models
Hugo Touvron
Louis Martin
Kevin R. Stone
Peter Albert
Amjad Almahairi
...
Sharan Narang
Aurelien Rodriguez
Robert Stojnic
Sergey Edunov
Thomas Scialom
AI4MH
ALM
249
11,636
0
18 Jul 2023
Going Beyond Linear Mode Connectivity: The Layerwise Linear Feature Connectivity
Zhanpeng Zhou
Yongyi Yang
Xiaojiang Yang
Junchi Yan
Wei Hu
57
31
0
17 Jul 2023
Composing Parameter-Efficient Modules with Arithmetic Operations
Jinghan Zhang
Shiqi Chen
Junteng Liu
Junxian He
KELM
MoMe
57
117
0
26 Jun 2023
TIES-Merging: Resolving Interference When Merging Models
Prateek Yadav
Derek Tam
Leshem Choshen
Colin Raffel
Joey Tianyi Zhou
MoMe
100
284
0
02 Jun 2023
Task Arithmetic in the Tangent Space: Improved Editing of Pre-Trained Models
Guillermo Ortiz-Jiménez
Alessandro Favero
P. Frossard
MoMe
96
119
0
22 May 2023
ZipIt! Merging Models from Different Tasks without Training
George Stoica
Daniel Bolya
J. Bjorner
Pratik Ramesh
Taylor N. Hearn
Judy Hoffman
VLM
MoMe
71
117
0
04 May 2023
LLaMA: Open and Efficient Foundation Language Models
Hugo Touvron
Thibaut Lavril
Gautier Izacard
Xavier Martinet
Marie-Anne Lachaux
...
Faisal Azhar
Aurelien Rodriguez
Armand Joulin
Edouard Grave
Guillaume Lample
ALM
PILM
964
12,840
0
27 Feb 2023
1
2
Next