Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch

6 November 2023 · arXiv:2311.03099 · [MoMe]
Le Yu, Yu Bowen, Haiyang Yu, Fei Huang, Yongbin Li
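
Before the list of citing papers, a brief note on the cited work itself: arXiv 2311.03099 is best known for DARE (drop and rescale), which sparsifies each fine-tuned model's parameter delta by randomly dropping most of its entries and rescaling the survivors by 1 / (1 - p) before the deltas are combined with the base weights. The Python sketch below only illustrates that idea; the function name dare_merge, the default drop rate, and the equal-weight averaging of the rescaled deltas are illustrative assumptions, not the paper's exact procedure.

```python
import torch


def dare_merge(base_state, finetuned_states, drop_rate=0.9):
    """Illustrative DARE-style merge over state dicts (assumed interface).

    For every parameter tensor: take each fine-tuned model's delta from the
    base, randomly drop a `drop_rate` fraction of its entries, rescale the
    survivors by 1 / (1 - drop_rate), then add the averaged deltas back onto
    the base weights. Assumes all checkpoints share the same architecture
    and floating-point parameters.
    """
    merged = {}
    for name, base_w in base_state.items():
        rescaled_deltas = []
        for ft_state in finetuned_states:
            delta = ft_state[name] - base_w                     # delta ("task") parameters
            keep = (torch.rand_like(delta) >= drop_rate).to(delta.dtype)
            rescaled_deltas.append(keep * delta / (1.0 - drop_rate))
        merged[name] = base_w + torch.stack(rescaled_deltas).mean(dim=0)
    return merged


# Hypothetical usage: merge two homologous fine-tunes back into their base.
# merged_state = dare_merge(base.state_dict(),
#                           [math_model.state_dict(), code_model.state_dict()])
# base.load_state_dict(merged_state)
```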

Papers citing "Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch"

Showing 50 of 223 citing papers, most recent first; each entry gives the title, authors, topic tags in brackets (where assigned), and date.
Liger: Linearizing Large Language Models to Gated Recurrent Structures
  Disen Lan, Weigao Sun, Jiaxi Hu, Jusen Du, Yu-Xi Cheng · 03 Mar 2025
DeRS: Towards Extremely Efficient Upcycled Mixture-of-Experts Models
  Y. Huang, Peng Ye, Chenyu Huang, Jianjian Cao, Lin Zhang, Baopu Li, Gang Yu, Tao Chen · [MoMe, MoE] · 03 Mar 2025
Med-LEGO: Editing and Adapting toward Generalist Medical Image Diagnosis
  Yitao Zhu, Yuan Yin, Jiaming Li, Mengjie Xu, Zihao Zhao, Honglin Xiong, Sheng Wang, Qian Wang · [MedIm] · 03 Mar 2025
Extrapolating and Decoupling Image-to-Video Generation Models: Motion Modeling is Easier Than You Think
  Jie Tian, Xiaoye Qu, Zhenyi Lu, Wei Wei, Sichen Liu, Yu-Xi Cheng · [DiffM, VGen] · 02 Mar 2025
Layer-Aware Task Arithmetic: Disentangling Task-Specific and Instruction-Following Knowledge
  Yan-Lun Chen, Yi-Ru Wei, Chia-Yi Hsu, Chia-Mu Yu, Chun-ying Huang, Ying-Dar Lin, Yu-Sung Wu, Wei-Bin Lee · [MoMe, KELM] · 27 Feb 2025
CABS: Conflict-Aware and Balanced Sparsification for Enhancing Model Merging
  Zongzhen Yang, Binhang Qi, Hailong Sun, Wenrui Long, Ruobing Zhao, Xiang Gao · [MoMe] · 26 Feb 2025
CAMEx: Curvature-aware Merging of Experts
  Dung V. Nguyen, Minh H. Nguyen, Luc Q. Nguyen, R. Teo, T. Nguyen, Linh Duy Tran · [MoMe] · 26 Feb 2025
Mixup Model Merge: Enhancing Model Merging Performance through Randomized Linear Interpolation
  Yue Zhou, Yi-Ju Chang, Yuan Wu · [MoMe] · 24 Feb 2025
Parameter Efficient Merging for Multimodal Large Language Models with Complementary Parameter Adaptation
  Fanhu Zeng, Haiyang Guo, Fei Zhu, Li Shen, Hao Tang · [MoMe] · 24 Feb 2025
LED-Merging: Mitigating Safety-Utility Conflicts in Model Merging with Location-Election-Disjoint
  Qianli Ma, Dongrui Liu, Qian Chen, Linfeng Zhang, Jing Shao · [MoMe] · 24 Feb 2025
Low-Rank and Sparse Model Merging for Multi-Lingual Speech Recognition and Translation
  Qiuming Zhao, Guangzhi Sun, Chao Zhang, Mingxing Xu, Thomas Fang Zheng · [MoMe, VLM] · 24 Feb 2025
Merger-as-a-Stealer: Stealing Targeted PII from Aligned LLMs with Model Merging
  Lin Lu, Zhigang Zuo, Ziji Sheng, Pan Zhou · [MoMe] · 22 Feb 2025
MoMa: A Modular Deep Learning Framework for Material Property Prediction
  Botian Wang, Y. Ouyang, Yaohui Li, Yansen Wang, Haorui Cui, Jianbing Zhang, Xiaonan Wang, Wei-Ying Ma, Hao Zhou · 21 Feb 2025
Sparsity May Be All You Need: Sparse Random Parameter Adaptation
  Jesus Rios, Pierre L. Dognin, Ronny Luss, K. Ramamurthy · 21 Feb 2025
Sens-Merging: Sensitivity-Guided Parameter Balancing for Merging Large Language Models
  Shuqi Liu, Han Wu, Bowei He, Xiongwei Han, M. Yuan, Linqi Song · [MoMe] · 20 Feb 2025
Scalable Model Merging with Progressive Layer-wise Distillation
  Jing Xu, Jiazheng Li, J.N. Zhang · [MoMe, FedML] · 18 Feb 2025
Exploring Translation Mechanism of Large Language Models
  Hongbin Zhang, Kehai Chen, Xuefeng Bai, Xiucheng Li, Yang Xiang, Min Zhang · 17 Feb 2025
Optimal Brain Iterative Merging: Mitigating Interference in LLM Merging
  Zhixiang Wang, Zhenyu Mao, Yixuan Qiao, Yunfang Wu, Biye Li · [MoMe] · 17 Feb 2025
Be Cautious When Merging Unfamiliar LLMs: A Phishing Model Capable of Stealing Privacy
  Zhenyuan Guo, Yi Shi, Wenlong Meng, Chen Gong, Chengkun Wei, Wenzhi Chen · [MoMe] · 17 Feb 2025
SuperMerge: An Approach For Gradient-Based Model Merging
  Haoyu Yang, Zheng Zhang, Saket Sathe · [MoMe] · 17 Feb 2025
Primus: A Pioneering Collection of Open-Source Datasets for Cybersecurity LLM Training
  Yao-Ching Yu, Tsun-Han Chiang, Cheng-Wei Tsai, Chien-Ming Huang, Wen-Kwang Tsao · 16 Feb 2025
Soteria: Language-Specific Functional Parameter Steering for Multilingual Safety Alignment
  Somnath Banerjee, Sayan Layek, Pratyush Chatterjee, Animesh Mukherjee, Rima Hazra · [LLMSV] · 16 Feb 2025
Bone Soups: A Seek-and-Soup Model Merging Approach for Controllable Multi-Objective Generation
  Guofu Xie, Xiao Zhang, Ting Yao, Yunsheng Shi · [MoMe] · 15 Feb 2025
1bit-Merging: Dynamic Quantized Merging for Large Language Models
  Shuqi Liu, Han Wu, Bowei He, Zehua Liu, Xiongwei Han, M. Yuan, Linqi Song · [MoMe, MQ] · 15 Feb 2025
LoRE-Merging: Exploring Low-Rank Estimation For Large Language Model Merging
  Zehua Liu, Han Wu, Yuxuan Yao, Ruifeng She, Xiongwei Han, Tao Zhong, M. Yuan · [MoMe] · 15 Feb 2025
Superpose Singular Features for Model Merging
  Haiquan Qiu, You Wu, Quanming Yao · [MoMe] · 15 Feb 2025
Propagation of Chaos for Mean-Field Langevin Dynamics and its Application to Model Ensemble
  Atsushi Nitanda, Anzelle Lee, Damian Tan Xing Kai, Mizuki Sakaguchi, Taiji Suzuki · [AI4CE] · 09 Feb 2025
Mol-MoE: Training Preference-Guided Routers for Molecule Generation
  Diego Calanzone, P. D'Oro, Pierre-Luc Bacon · 08 Feb 2025
MergeME: Model Merging Techniques for Homogeneous and Heterogeneous MoEs
  Yuhang Zhou, Giannis Karamanolakis, Victor Soto, Anna Rumshisky, Mayank Kulkarni, Furong Huang, Wei Ai, Jianhua Lu · [MoMe] · 03 Feb 2025
Sparse High Rank Adapters
  K. Bhardwaj, N. Pandey, Sweta Priyadarshi, Viswanath Ganapathy, Rafael Esteves, ..., P. Whatmough, Risheek Garrepalli, M. V. Baalen, Harris Teague, Markus Nagel · [MQ] · 28 Jan 2025
Evolutionary Optimization of Model Merging Recipes
  Takuya Akiba, Makoto Shing, Yujin Tang, Qi Sun, David Ha · [MoMe] · 28 Jan 2025
Multi-Task Model Merging via Adaptive Weight Disentanglement
  Feng Xiong, Runxi Cheng, Wang Chen, Zhanqiu Zhang, Yiwen Guo, Chun Yuan, Ruifeng Xu · [MoMe] · 10 Jan 2025
Localize-and-Stitch: Efficient Model Merging via Sparse Task Arithmetic
  Yifei He, Yuzheng Hu, Yong Lin, Tong Zhang, Han Zhao · [FedML, MoMe] · 08 Jan 2025
Training-free Heterogeneous Model Merging
  Zhengqi Xu, Han Zheng, Jie Song, Li Sun, Mingli Song · [MoMe] · 03 Jan 2025
Cut the Deadwood Out: Post-Training Model Purification with Selective Module Substitution
  Yao Tong, Weijun Li, Xuanli He, Haolan Zhan, Qiongkai Xu · [AAML] · 31 Dec 2024
ChipAlign: Instruction Alignment in Large Language Models for Chip Design via Geodesic Interpolation
  Chenhui Deng, Yunsheng Bai, Haoxing Ren · 31 Dec 2024
Parameter-Efficient Interventions for Enhanced Model Merging
  Marcin Osial, Daniel Marczak, Bartosz Zieliński · [MoMe] · 22 Dec 2024
SafetyDPO: Scalable Safety Alignment for Text-to-Image Generation
  Runtao Liu, Chen I Chieh, Jindong Gu, Jipeng Zhang, Renjie Pi, Qifeng Chen, Philip Torr, Ashkan Khakzar, Fabio Pizzati · [EGVM] · 13 Dec 2024
How to Merge Your Multimodal Models Over Time?
  Sebastian Dziadzio, Vishaal Udandarao, Karsten Roth, Ameya Prabhu, Zeynep Akata, Samuel Albanie, Matthias Bethge · [MoMe] · 09 Dec 2024
Enhancing Perception Capabilities of Multimodal LLMs with Training-Free Fusion
  Zhuokun Chen, Jinwu Hu, Zeshuai Deng, Yufeng Wang, Bohan Zhuang, Mingkui Tan · 02 Dec 2024
Task Arithmetic Through The Lens Of One-Shot Federated Learning
  Zhixu Tao, I. Mason, Sanjeev R. Kulkarni, Xavier Boix · [MoMe, FedML] · 27 Nov 2024
Task Singular Vectors: Reducing Task Interference in Model Merging
  Antonio Andrea Gargiulo, Donato Crisostomi, Maria Sofia Bucarelli, Simone Scardapane, Fabrizio Silvestri, Emanuele Rodolà · [MoMe] · 26 Nov 2024
Beyond Task Vectors: Selective Task Arithmetic Based on Importance Metrics
  Tian Bowen, Lai Songning, Wu Jiemin, Shuai Zhihao, Ge Shiming, Yue Yutao · [MoMe] · 25 Nov 2024
FREE-Merging: Fourier Transform for Efficient Model Merging
  Shenghe Zheng, Hongzhi Wang · [MoMe] · 25 Nov 2024
Sparse Orthogonal Parameters Tuning for Continual Learning
  Kun-Peng Ning, Hai-Jian Ke, Yu-Yang Liu, Jia-Yu Yao, Yong-Hong Tian, Li Yuan · [CLL] · 05 Nov 2024
Collective Model Intelligence Requires Compatible Specialization
  Jyothish Pari, Samy Jelassi, Pulkit Agrawal · [MoMe] · 04 Nov 2024
MoD: A Distribution-Based Approach for Merging Large Language Models
  Quy-Anh Dang, Chris Ngo · [MoMe, VLM] · 01 Nov 2024
Efficient and Effective Weight-Ensembling Mixture of Experts for Multi-Task Model Merging
  Li Shen, Anke Tang, Enneng Yang, G. Guo, Yong Luo, Lefei Zhang, Xiaochun Cao, Bo Du, Dacheng Tao · [MoMe] · 29 Oct 2024
Model merging with SVD to tie the Knots
  George Stoica, Pratik Ramesh, B. Ecsedi, Leshem Choshen, Judy Hoffman · [MoMe] · 25 Oct 2024
Can Large Language Models Invent Algorithms to Improve Themselves?
  Yoichi Ishibashi, Taro Yano, Masafumi Oyamada · [AIFin, LRM] · 21 Oct 2024