Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch

6 November 2023
Le Yu
Yu Bowen
Haiyang Yu
Fei Huang
Yongbin Li
    MoMe
arXiv: 2311.03099

Papers citing "Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch"

50 / 223 papers shown
Scalable Strategies for Continual Learning with Replay
Truman Hickok
CLL
2
0
0
18 May 2025
MINGLE: Mixtures of Null-Space Gated Low-Rank Experts for Test-Time Continual Model Merging
Zihuan Qiu
Yi Xu
Chiyuan He
Fanman Meng
Linfeng Xu
Qi Wu
Hongliang Li
CLL
MoMe
14
0
0
17 May 2025
MergeBench: A Benchmark for Merging Domain-Specialized LLMs
Yifei He
Siqi Zeng
Yuzheng Hu
Rui Yang
Tong Zhang
Han Zhao
MoMe
ALM
24
0
0
16 May 2025
Mergenetic: a Simple Evolutionary Model Merging Library
Adrian Robert Minut
Tommaso Mencattini
Andrea Santilli
Donato Crisostomi
Emanuele Rodolà
MoMe
24
0
0
16 May 2025
A Modular Approach for Clinical SLMs Driven by Synthetic Data with Pre-Instruction Tuning, Model Merging, and Clinical-Tasks Alignment
Jean-Philippe Corbeil
Amin Dada
Jean-Michel Attendu
Asma Ben Abacha
Alessandro Sordoni
Lucas Caccia
François Beaulieu
Thomas Lin
Jens Kleesiek
Paul Vozila
LM&MA
17
0
0
15 May 2025
RAP-SM: Robust Adversarial Prompt via Shadow Models for Copyright Verification of Large Language Models
Zhenhua Xu
Zhebo Wang
Maike Li
Wenpeng Xing
Chunqiang Hu
Chen Zhi
Meng Han
AAML
36
0
0
08 May 2025
SEFE: Superficial and Essential Forgetting Eliminator for Multimodal Continual Instruction Tuning
Jinpeng Chen
Runmin Cong
Yuzhi Zhao
Hongzheng Yang
Guangneng Hu
H. Ip
Sam Kwong
CLL
KELM
83
0
0
05 May 2025
Investigating Task Arithmetic for Zero-Shot Information Retrieval
Marco Braga
Pranav Kasela
Alessandro Raganato
G. Pasi
RALM
69
0
0
01 May 2025
Dynamic Fisher-weighted Model Merging via Bayesian Optimization
Sanwoo Lee
Jiahao Liu
Qifan Wang
Jiadong Wang
Xunliang Cai
Yunfang Wu
MoMe
172
0
0
26 Apr 2025
Param$Δ$ for Direct Weight Mixing: Post-Train Large Language Model at Zero Cost
Sheng Cao
Mingrui Wu
Karthik Prasad
Yuandong Tian
Zechun Liu
MoMe
82
0
0
23 Apr 2025
EasyEdit2: An Easy-to-use Steering Framework for Editing Large Language Models
Ziwen Xu
Shuxun Wang
Kewei Xu
Haoming Xu
Mengru Wang
Xinle Deng
Yunzhi Yao
Guozhou Zheng
H. Chen
Ningyu Zhang
KELM
LLMSV
181
0
0
21 Apr 2025
Mitigating Parameter Interference in Model Merging via Sharpness-Aware Fine-Tuning
Yeoreum Lee
Jinwook Jung
Sungyong Baik
MoMe
45
0
0
20 Apr 2025
ImPart: Importance-Aware Delta-Sparsification for Improved Model Compression and Merging in LLMs
Yan Yang
Yixia Li
Hongru Wang
Xuetao Wei
Jianqiao Yu
Yun-Nung Chen
Guanhua Chen
MoMe
28
0
0
17 Apr 2025
When is Task Vector Provably Effective for Model Editing? A Generalization Analysis of Nonlinear Transformers
Hongkang Li
Yihua Zhang
Shuai Zhang
Hao Wu
Sijia Liu
Pin-Yu Chen
MoMe
69
4
0
15 Apr 2025
Leveraging Submodule Linearity Enhances Task Arithmetic Performance in LLMs
Rui Dai
Sile Hu
Xu Shen
Yonggang Zhang
Xinmei Tian
Jieping Ye
MoMe
54
2
0
15 Apr 2025
Reduction of Supervision for Biomedical Knowledge Discovery
Christos Theodoropoulos
Andrei Catalin Coman
James Henderson
Marie-Francine Moens
27
0
0
13 Apr 2025
LoRI: Reducing Cross-Task Interference in Multi-Task Low-Rank Adaptation
Juzheng Zhang
Jiacheng You
Ashwinee Panda
Tom Goldstein
MoMe
53
1
0
10 Apr 2025
Defending Deep Neural Networks against Backdoor Attacks via Module Switching
Weijun Li
Ansh Arora
Xuanli He
Mark Dras
Qiongkai Xu
AAML
MoMe
53
0
0
08 Apr 2025
SEA-LION: Southeast Asian Languages in One Network
Raymond Ng
Thanh Ngan Nguyen
Yuli Huang
Ngee Chia Tai
Wai Yi Leong
...
David Ong Tat-Wee
B. Liu
William-Chandra Tjhi
Min Zhang
Leslie Teo
38
12
0
08 Apr 2025
MASS: MoErging through Adaptive Subspace Selection
Donato Crisostomi
Alessandro Zirilli
Antonio Andrea Gargiulo
Maria Sofia Bucarelli
Simone Scardapane
Fabrizio Silvestri
Iacopo Masi
Emanuele Rodolà
MoMe
40
0
0
06 Apr 2025
Exact Unlearning of Finetuning Data via Model Merging at Scale
Kevin Kuo
Amrith Rajagopal Setlur
Kartik Srinivas
Aditi Raghunathan
Virginia Smith
MoMe
CLL
MU
45
0
0
06 Apr 2025
Efficient Model Editing with Task-Localized Sparse Fine-tuning
Leonardo Iurada
Marco Ciccone
Tatiana Tommasi
KELM
MoMe
56
2
0
03 Apr 2025
AdaMMS: Model Merging for Heterogeneous Multimodal Large Language Models with Unsupervised Coefficient Optimization
Yiyang Du
Xiaochen Wang
C. Chen
Jiabo Ye
Yiru Wang
...
J.N. Zhang
Fei Huang
Zhifang Sui
Maosong Sun
Yi Liu
MoMe
57
0
0
31 Mar 2025
Enhancing Image Resolution of Solar Magnetograms: A Latent Diffusion Model Approach
Francesco P. Ramunno
Paolo Massa
Vitaliy Kinakh
Brandon Panos
A. Csillaghy
S. Voloshynovskiy
DiffM
53
0
0
31 Mar 2025
Breach in the Shield: Unveiling the Vulnerabilities of Large Language Models
Runpeng Dai
Run Yang
Fan Zhou
Hongtu Zhu
31
0
0
28 Mar 2025
AdaRank: Adaptive Rank Pruning for Enhanced Model Merging
Chanhyuk Lee
Jiho Choi
Chanryeol Lee
Donggyun Kim
Seunghoon Hong
MoMe
55
0
0
28 Mar 2025
Model Assembly Learning with Heterogeneous Layer Weight Merging
Yi-Kai Zhang
Jin Wang
Xu-Xiang Zhong
De-Chuan Zhan
Han-Jia Ye
MoMe
54
0
0
27 Mar 2025
Reinforced Model Merging
J. N. Han
Jingwen Ye
Shunyu Liu
Haofei Zhang
Jie Song
Zunlei Feng
Mingli Song
MoMe
55
0
0
27 Mar 2025
ZJUKLAB at SemEval-2025 Task 4: Unlearning via Model Merging
Haoming Xu
Shuxun Wang
Yanqiu Zhao
Yi Zhong
Ziyan Jiang
Ningyuan Zhao
Shumin Deng
Hongyu Chen
N. Zhang
MoMe
MU
72
0
0
27 Mar 2025
Unlocking Efficient Long-to-Short LLM Reasoning with Model Merging
Han Wu
Yuxuan Yao
Shuqi Liu
Zehua Liu
Xiaojin Fu
Xiongwei Han
Xianrui Li
Hui-Ling Zhen
Tao Zhong
Mingxuan Yuan
MoMe
LRM
78
5
0
26 Mar 2025
Unlocking the Value of Decentralized Data: A Federated Dual Learning Approach for Model Aggregation
Junyi Zhu
Ruicong Yao
Taha Ceritli
Savas Ozkan
Matthew B. Blaschko
Eunchung Noh
Jeongwon Min
Cho Jung Min
Mete Ozay
FedML
103
0
0
26 Mar 2025
Efficient Model Development through Fine-tuning Transfer
Pin-Jie Lin
Rishab Balasubramanian
Fengyuan Liu
Nikhil Kandpal
Tu Vu
64
1
0
25 Mar 2025
LoRASculpt: Sculpting LoRA for Harmonizing General and Specialized Knowledge in Multimodal Large Language Models
Jian Liang
Wenke Huang
Guancheng Wan
Qu Yang
Mang Ye
MoMe
CLL
AI4CE
62
1
0
21 Mar 2025
SafeMERGE: Preserving Safety Alignment in Fine-Tuned Large Language Models via Selective Layer-Wise Model Merging
Aladin Djuhera
S. Kadhe
Farhan Ahmed
Syed Zawad
Holger Boche
MoMe
51
0
0
21 Mar 2025
From Task-Specific Models to Unified Systems: A Review of Model Merging Approaches
Wei Ruan
Tianze Yang
Yue Zhou
Tianming Liu
Jin Lu
MoMe
93
0
0
13 Mar 2025
Ensemble Learning for Large Language Models in Text and Code Generation: A Survey
Mari Ashiga
Wei Jie
Fan Wu
Vardan K. Voskanyan
Fateme Dinmohammadi
P. Brookes
Jingzhi Gong
Zheng Wang
44
0
0
13 Mar 2025
Enhanced Continual Learning of Vision-Language Models with Model Fusion
Haoyuan Gao
Zicong Zhang
Yuqi Wei
Linglan Zhao
Guilin Li
Yuan Li
Linghe Kong
Weiran Huang
CLL
VLM
185
0
0
12 Mar 2025
ReMA: Learning to Meta-think for LLMs with Multi-Agent Reinforcement Learning
Bo Liu
Yunxiang Li
Yangqiu Song
Hanjing Wang
Linyi Yang
Mark W. Schmidt
Jun Wang
Weinan Zhang
Shuyue Hu
Ying Wen
LLMAG
KELM
LRM
AI4CE
92
6
0
12 Mar 2025
Whoever Started the Interference Should End It: Guiding Data-Free Model Merging via Task Vectors
Runxi Cheng
Feng Xiong
Yongxian Wei
Wanyun Zhu
Chun Yuan
MoMe
68
0
0
11 Mar 2025
Task Vector Quantization for Memory-Efficient Model Merging
Youngeun Kim
Seunghwan Lee
Aecheon Jung
Bogon Ryu
Sungeun Hong
MQ
MoMe
54
0
0
10 Mar 2025
Self-supervised Normality Learning and Divergence Vector-guided Model Merging for Zero-shot Congenital Heart Disease Detection in Fetal Ultrasound Videos
Pramit Saha
Divyanshu Mishra
Netzahualcoyotl Hernandez-Cruz
Olga Patey
A. Papageorghiou
Yuki M. Asano
J. A. Noble
48
0
0
10 Mar 2025
Seeing Delta Parameters as JPEG Images: Data-Free Delta Compression with Discrete Cosine Transform
Chenyu Huang
Peng Ye
Xinyu Wang
Shenghe Zheng
Biqing Qi
Lei Bai
Wanli Ouyang
Tao Chen
31
0
0
09 Mar 2025
Merge then Realign: Simple and Effective Modality-Incremental Continual Learning for Multimodal LLMs
Dingkun Zhang
Shuhan Qi
Xinyu Xiao
Kehai Chen
Xuan Wang
CLL
MoMe
66
0
0
08 Mar 2025
RouterEval: A Comprehensive Benchmark for Routing LLMs to Explore Model-level Scaling Up in LLMs
Zhongzhan Huang
Guoming Ling
Vincent S. Liang
Yupei Lin
Yandong Chen
Shanshan Zhong
Hefeng Wu
LRM
54
2
0
08 Mar 2025
Disrupting Model Merging: A Parameter-Level Defense Without Sacrificing Accuracy
Wei Junhao
Yu Zhe
Sakuma Jun
AAML
MoMe
59
0
0
08 Mar 2025
Fairness-Aware Low-Rank Adaptation Under Demographic Privacy Constraints
Parameswaran Kamalaruban
Mark Anderson
Stuart Burrell
Maeve Madigan
Piotr Skalski
David Sutton
57
0
0
07 Mar 2025
Keeping Yourself is Important in Downstream Tuning Multimodal Large Language Model
Wenke Huang
Jian Liang
Xianda Guo
Yiyang Fang
Guancheng Wan
...
Bin Yang
He Li
Jiawei Shao
Mang Ye
Bo Du
OffRL
LRM
MLLM
KELM
VLM
65
1
0
06 Mar 2025
Extrapolation Merging: Keep Improving With Extrapolation and Merging
Yiguan Lin
Bin Xu
Yinghao Li
Yang Gao
MoMe
59
1
0
05 Mar 2025
LEWIS (LayEr WIse Sparsity) -- A Training Free Guided Model Merging Approach
Hetarth Chopra
Vidhi Rambhia
Vikram Adve
MoMe
70
0
0
05 Mar 2025
LoRA-Null: Low-Rank Adaptation via Null Space for Large Language Models
Pengwei Tang
Y. Liu
Dongjie Zhang
Xing Wu
Debing Zhang
62
0
0
04 Mar 2025