arXiv:2311.03099
Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch
6 November 2023
Le Yu
Bowen Yu
Haiyang Yu
Fei Huang
Yongbin Li
MoMe
Papers citing
"Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch"
50 / 223 papers shown
Title
Liger: Linearizing Large Language Models to Gated Recurrent Structures
Disen Lan
Weigao Sun
Jiaxi Hu
Jusen Du
Yu-Xi Cheng
69
0
0
03 Mar 2025
DeRS: Towards Extremely Efficient Upcycled Mixture-of-Experts Models
Y. Huang
Peng Ye
Chenyu Huang
Jianjian Cao
Lin Zhang
Baopu Li
Gang Yu
Tao Chen
MoMe
MoE
58
1
0
03 Mar 2025
Med-LEGO: Editing and Adapting toward Generalist Medical Image Diagnosis
Yitao Zhu
Yuan Yin
Jiaming Li
Mengjie Xu
Zihao Zhao
Honglin Xiong
Sheng Wang
Qian Wang
MedIm
75
0
0
03 Mar 2025
Extrapolating and Decoupling Image-to-Video Generation Models: Motion Modeling is Easier Than You Think
Jie Tian
Xiaoye Qu
Zhenyi Lu
Wei Wei
Sichen Liu
Yu-Xi Cheng
DiffM
VGen
44
0
0
02 Mar 2025
Layer-Aware Task Arithmetic: Disentangling Task-Specific and Instruction-Following Knowledge
Yan-Lun Chen
Yi-Ru Wei
Chia-Yi Hsu
Chia-Mu Yu
Chun-ying Huang
Ying-Dar Lin
Yu-Sung Wu
Wei-Bin Lee
MoMe
KELM
56
0
0
27 Feb 2025
CABS: Conflict-Aware and Balanced Sparsification for Enhancing Model Merging
Zongzhen Yang
Binhang Qi
Hailong Sun
Wenrui Long
Ruobing Zhao
Xiang Gao
MoMe
48
0
0
26 Feb 2025
CAMEx: Curvature-aware Merging of Experts
Dung V. Nguyen
Minh H. Nguyen
Luc Q. Nguyen
R. Teo
T. Nguyen
Linh Duy Tran
MoMe
104
2
0
26 Feb 2025
Mixup Model Merge: Enhancing Model Merging Performance through Randomized Linear Interpolation
Yue Zhou
Yi-Ju Chang
Yuan Wu
MoMe
69
2
0
24 Feb 2025
Parameter Efficient Merging for Multimodal Large Language Models with Complementary Parameter Adaptation
Fanhu Zeng
Haiyang Guo
Fei Zhu
Li Shen
Hao Tang
MoMe
54
1
0
24 Feb 2025
LED-Merging: Mitigating Safety-Utility Conflicts in Model Merging with Location-Election-Disjoint
Qianli Ma
Dongrui Liu
Qian Chen
Linfeng Zhang
Jing Shao
MoMe
168
0
0
24 Feb 2025
Low-Rank and Sparse Model Merging for Multi-Lingual Speech Recognition and Translation
Qiuming Zhao
Guangzhi Sun
Chao Zhang
Mingxing Xu
Thomas Fang Zheng
MoMe
VLM
184
0
0
24 Feb 2025
Merger-as-a-Stealer: Stealing Targeted PII from Aligned LLMs with Model Merging
Lin Lu
Zhigang Zuo
Ziji Sheng
Pan Zhou
MoMe
57
0
0
22 Feb 2025
MoMa: A Modular Deep Learning Framework for Material Property Prediction
Botian Wang
Y. Ouyang
Yaohui Li
Yansen Wang
Haorui Cui
Jianbing Zhang
Xiaonan Wang
Wei-Ying Ma
Hao Zhou
49
0
0
21 Feb 2025
Sparsity May Be All You Need: Sparse Random Parameter Adaptation
Jesus Rios
Pierre L. Dognin
Ronny Luss
K. Ramamurthy
32
1
0
21 Feb 2025
Sens-Merging: Sensitivity-Guided Parameter Balancing for Merging Large Language Models
Shuqi Liu
Han Wu
Bowei He
Xiongwei Han
M. Yuan
Linqi Song
MoMe
63
1
0
20 Feb 2025
Scalable Model Merging with Progressive Layer-wise Distillation
Jing Xu
Jiazheng Li
J.N. Zhang
MoMe
FedML
90
0
0
18 Feb 2025
Exploring Translation Mechanism of Large Language Models
Hongbin Zhang
Kehai Chen
Xuefeng Bai
Xiucheng Li
Yang Xiang
Min Zhang
67
1
0
17 Feb 2025
Optimal Brain Iterative Merging: Mitigating Interference in LLM Merging
Zhixiang Wang
Zhenyu Mao
Yixuan Qiao
Yunfang Wu
Biye Li
MoMe
73
0
0
17 Feb 2025
Be Cautious When Merging Unfamiliar LLMs: A Phishing Model Capable of Stealing Privacy
Zhenyuan Guo
Yi Shi
Wenlong Meng
Chen Gong
Chengkun Wei
Wenzhi Chen
MoMe
66
0
0
17 Feb 2025
SuperMerge: An Approach For Gradient-Based Model Merging
Haoyu Yang
Zheng Zhang
Saket Sathe
MoMe
127
0
0
17 Feb 2025
Primus: A Pioneering Collection of Open-Source Datasets for Cybersecurity LLM Training
Yao-Ching Yu
Tsun-Han Chiang
Cheng-Wei Tsai
Chien-Ming Huang
Wen-Kwang Tsao
62
6
0
16 Feb 2025
Soteria: Language-Specific Functional Parameter Steering for Multilingual Safety Alignment
Somnath Banerjee
Sayan Layek
Pratyush Chatterjee
Animesh Mukherjee
Rima Hazra
LLMSV
76
0
0
16 Feb 2025
Bone Soups: A Seek-and-Soup Model Merging Approach for Controllable Multi-Objective Generation
Guofu Xie
Xiao Zhang
Ting Yao
Yunsheng Shi
MoMe
63
1
0
15 Feb 2025
1bit-Merging: Dynamic Quantized Merging for Large Language Models
Shuqi Liu
Han Wu
Bowei He
Zehua Liu
Xiongwei Han
M. Yuan
Linqi Song
MoMe
MQ
74
1
0
15 Feb 2025
LoRE-Merging: Exploring Low-Rank Estimation For Large Language Model Merging
Zehua Liu
Han Wu
Yuxuan Yao
Ruifeng She
Xiongwei Han
Tao Zhong
M. Yuan
MoMe
52
1
0
15 Feb 2025
Superpose Singular Features for Model Merging
Haiquan Qiu
You Wu
Quanming Yao
MoMe
48
0
0
15 Feb 2025
Propagation of Chaos for Mean-Field Langevin Dynamics and its Application to Model Ensemble
Atsushi Nitanda
Anzelle Lee
Damian Tan Xing Kai
Mizuki Sakaguchi
Taiji Suzuki
AI4CE
67
1
0
09 Feb 2025
Mol-MoE: Training Preference-Guided Routers for Molecule Generation
Diego Calanzone
P. D'Oro
Pierre-Luc Bacon
52
0
0
08 Feb 2025
MergeME: Model Merging Techniques for Homogeneous and Heterogeneous MoEs
Yuhang Zhou
Giannis Karamanolakis
Victor Soto
Anna Rumshisky
Mayank Kulkarni
Furong Huang
Wei Ai
Jianhua Lu
MoMe
111
0
0
03 Feb 2025
Sparse High Rank Adapters
K. Bhardwaj
N. Pandey
Sweta Priyadarshi
Viswanath Ganapathy
Rafael Esteves
...
P. Whatmough
Risheek Garrepalli
M. V. Baalen
Harris Teague
Markus Nagel
MQ
43
4
0
28 Jan 2025
Evolutionary Optimization of Model Merging Recipes
Takuya Akiba
Makoto Shing
Yujin Tang
Qi Sun
David Ha
MoMe
116
100
0
28 Jan 2025
Multi-Task Model Merging via Adaptive Weight Disentanglement
Feng Xiong
Runxi Cheng
Wang Chen
Zhanqiu Zhang
Yiwen Guo
Chun Yuan
Ruifeng Xu
MoMe
102
4
0
10 Jan 2025
Localize-and-Stitch: Efficient Model Merging via Sparse Task Arithmetic
Yifei He
Yuzheng Hu
Yong Lin
Tong Zhang
Han Zhao
FedML
MoMe
65
19
0
08 Jan 2025
Training-free Heterogeneous Model Merging
Zhengqi Xu
Han Zheng
Jie Song
Li Sun
Mingli Song
MoMe
72
1
0
03 Jan 2025
Cut the Deadwood Out: Post-Training Model Purification with Selective Module Substitution
Yao Tong
Weijun Li
Xuanli He
Haolan Zhan
Qiongkai Xu
AAML
38
1
0
31 Dec 2024
ChipAlign: Instruction Alignment in Large Language Models for Chip Design via Geodesic Interpolation
Chenhui Deng
Yunsheng Bai
Haoxing Ren
39
1
0
31 Dec 2024
Parameter-Efficient Interventions for Enhanced Model Merging
Marcin Osial
Daniel Marczak
Bartosz Zieliński
MoMe
84
1
0
22 Dec 2024
SafetyDPO: Scalable Safety Alignment for Text-to-Image Generation
Runtao Liu
Chen I Chieh
Jindong Gu
Jipeng Zhang
Renjie Pi
Qifeng Chen
Philip Torr
Ashkan Khakzar
Fabio Pizzati
EGVM
111
0
0
13 Dec 2024
How to Merge Your Multimodal Models Over Time?
Sebastian Dziadzio
Vishaal Udandarao
Karsten Roth
Ameya Prabhu
Zeynep Akata
Samuel Albanie
Matthias Bethge
MoMe
103
3
0
09 Dec 2024
Enhancing Perception Capabilities of Multimodal LLMs with Training-Free Fusion
Zhuokun Chen
Jinwu Hu
Zeshuai Deng
Yufeng Wang
Bohan Zhuang
Mingkui Tan
71
0
0
02 Dec 2024
Task Arithmetic Through The Lens Of One-Shot Federated Learning
Zhixu Tao
I. Mason
Sanjeev R. Kulkarni
Xavier Boix
MoMe
FedML
84
3
0
27 Nov 2024
Task Singular Vectors: Reducing Task Interference in Model Merging
Antonio Andrea Gargiulo
Donato Crisostomi
Maria Sofia Bucarelli
Simone Scardapane
Fabrizio Silvestri
Emanuele Rodolà
MoMe
87
9
0
26 Nov 2024
Beyond Task Vectors: Selective Task Arithmetic Based on Importance Metrics
Tian Bowen
Lai Songning
Wu Jiemin
Shuai Zhihao
Ge Shiming
Yue Yutao
MoMe
70
4
0
25 Nov 2024
FREE-Merging: Fourier Transform for Efficient Model Merging
Shenghe Zheng
Hongzhi Wang
MoMe
77
1
0
25 Nov 2024
Sparse Orthogonal Parameters Tuning for Continual Learning
Kun-Peng Ning
Hai-Jian Ke
Yu-Yang Liu
Jia-Yu Yao
Yong-Hong Tian
Li Yuan
CLL
30
1
0
05 Nov 2024
Collective Model Intelligence Requires Compatible Specialization
Jyothish Pari
Samy Jelassi
Pulkit Agrawal
MoMe
51
1
0
04 Nov 2024
MoD: A Distribution-Based Approach for Merging Large Language Models
Quy-Anh Dang
Chris Ngo
MoMe
VLM
31
0
0
01 Nov 2024
Efficient and Effective Weight-Ensembling Mixture of Experts for Multi-Task Model Merging
Li Shen
Anke Tang
Enneng Yang
G. Guo
Yong Luo
Lefei Zhang
Xiaochun Cao
Bo Du
Dacheng Tao
MoMe
37
6
0
29 Oct 2024
Model merging with SVD to tie the Knots
George Stoica
Pratik Ramesh
B. Ecsedi
Leshem Choshen
Judy Hoffman
MoMe
39
9
0
25 Oct 2024
Can Large Language Models Invent Algorithms to Improve Themselves?
Yoichi Ishibashi
Taro Yano
Masafumi Oyamada
AIFin
LRM
34
1
0
21 Oct 2024