ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2311.03099
  4. Cited By
Language Models are Super Mario: Absorbing Abilities from Homologous
  Models as a Free Lunch

Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch

6 November 2023
Le Yu
Yu Bowen
Haiyang Yu
Fei Huang
Yongbin Li
    MoMe
ArXivPDFHTML

Papers citing "Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch"

50 / 223 papers shown
Title
Knowledge Mechanisms in Large Language Models: A Survey and Perspective
Knowledge Mechanisms in Large Language Models: A Survey and Perspective
Meng Wang
Yunzhi Yao
Ziwen Xu
Shuofei Qiao
Shumin Deng
...
Yong-jia Jiang
Pengjun Xie
Fei Huang
Huajun Chen
Ningyu Zhang
55
28
0
22 Jul 2024
Recent Advances in Generative AI and Large Language Models: Current
  Status, Challenges, and Perspectives
Recent Advances in Generative AI and Large Language Models: Current Status, Challenges, and Perspectives
D. Hagos
Rick Battle
Danda B. Rawat
LM&MA
OffRL
34
23
0
20 Jul 2024
OmniBind: Large-scale Omni Multimodal Representation via Binding Spaces
OmniBind: Large-scale Omni Multimodal Representation via Binding Spaces
Zehan Wang
Ziang Zhang
Hang Zhang
Luping Liu
Rongjie Huang
Xize Cheng
Hengshuang Zhao
Zhou Zhao
46
9
0
16 Jul 2024
Mitigating Catastrophic Forgetting in Language Transfer via Model
  Merging
Mitigating Catastrophic Forgetting in Language Transfer via Model Merging
Anton Alexandrov
Veselin Raychev
Mark Niklas Muller
Ce Zhang
Martin Vechev
Kristina Toutanova
MoMe
CLL
KELM
42
14
0
11 Jul 2024
Merge, Ensemble, and Cooperate! A Survey on Collaborative Strategies in
  the Era of Large Language Models
Merge, Ensemble, and Cooperate! A Survey on Collaborative Strategies in the Era of Large Language Models
Jinliang Lu
Ziliang Pang
Min Xiao
Yaochen Zhu
Rui Xia
Jiajun Zhang
MoMe
52
18
0
08 Jul 2024
Unlocking the Potential of Model Merging for Low-Resource Languages
Unlocking the Potential of Model Merging for Low-Resource Languages
Mingxu Tao
Chen Zhang
Quzhe Huang
Tianyao Ma
Songfang Huang
Dongyan Zhao
Yansong Feng
CLL
MoMe
30
3
0
04 Jul 2024
PLeaS -- Merging Models with Permutations and Least Squares
PLeaS -- Merging Models with Permutations and Least Squares
Anshul Nasery
J. Hayase
Pang Wei Koh
Sewoong Oh
MoMe
51
3
0
02 Jul 2024
It's Morphing Time: Unleashing the Potential of Multiple LLMs via
  Multi-objective Optimization
It's Morphing Time: Unleashing the Potential of Multiple LLMs via Multi-objective Optimization
Bingdong Li
Zixiang Di
Yanting Yang
Hong Qian
Peng Yang
Hao Hao
Ke Tang
Aimin Zhou
MoMe
19
5
0
29 Jun 2024
Sequential Editing for Lifelong Training of Speech Recognition Models
Sequential Editing for Lifelong Training of Speech Recognition Models
Devang Kulshreshtha
Saket Dingliwal
Brady C. Houston
Nikolaos Pappas
S. Ronanki
KELM
CLL
34
1
0
25 Jun 2024
Lottery Ticket Adaptation: Mitigating Destructive Interference in LLMs
Lottery Ticket Adaptation: Mitigating Destructive Interference in LLMs
Ashwinee Panda
Berivan Isik
Xiangyu Qi
Sanmi Koyejo
Tsachy Weissman
Prateek Mittal
MoMe
45
13
0
24 Jun 2024
WARP: On the Benefits of Weight Averaged Rewarded Policies
WARP: On the Benefits of Weight Averaged Rewarded Policies
Alexandre Ramé
Johan Ferret
Nino Vieillard
Robert Dadashi
Léonard Hussenot
Pierre-Louis Cedoz
Pier Giuseppe Sessa
Sertan Girgin
Arthur Douillard
Olivier Bachem
62
14
0
24 Jun 2024
Domain Adaptation of Llama3-70B-Instruct through Continual Pre-Training
  and Model Merging: A Comprehensive Evaluation
Domain Adaptation of Llama3-70B-Instruct through Continual Pre-Training and Model Merging: A Comprehensive Evaluation
Shamane Siriwardhana
Mark McQuade
Thomas Gauthier
Lucas Atkins
Fernando Fernandes Neto
...
Anneketh Vij
Tyler Odenthal
Charles Goddard
Mary MacCarthy
Jacob Solawetz
CLL
MoMe
ALM
39
8
0
21 Jun 2024
Model Merging and Safety Alignment: One Bad Model Spoils the Bunch
Model Merging and Safety Alignment: One Bad Model Spoils the Bunch
Hasan Hammoud
Umberto Michieli
Fabio Pizzati
Philip Torr
Adel Bibi
Guohao Li
Mete Ozay
MoMe
31
15
0
20 Jun 2024
Breaking the Ceiling of the LLM Community by Treating Token Generation
  as a Classification for Ensembling
Breaking the Ceiling of the LLM Community by Treating Token Generation as a Classification for Ensembling
Yao-Ching Yu
Chun-Chih Kuo
Ziqi Ye
Yu-Cheng Chang
Yueh-Se Li
56
9
0
18 Jun 2024
Knowledge Fusion By Evolving Weights of Language Models
Knowledge Fusion By Evolving Weights of Language Models
Guodong Du
Jing Li
Hanting Liu
Runhua Jiang
Shuyang Yu
Yifei Guo
S. Goh
Ho-Kin Tang
MoMe
44
8
0
18 Jun 2024
Self-MoE: Towards Compositional Large Language Models with
  Self-Specialized Experts
Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts
Junmo Kang
Leonid Karlinsky
Hongyin Luo
Zhen Wang
Jacob A. Hansen
James Glass
David D. Cox
Yikang Shen
Rogerio Feris
Alan Ritter
MoMe
MoE
42
8
0
17 Jun 2024
DELLA-Merging: Reducing Interference in Model Merging through
  Magnitude-Based Sampling
DELLA-Merging: Reducing Interference in Model Merging through Magnitude-Based Sampling
Pala Tej Deep
Rishabh Bhardwaj
Soujanya Poria
MoMe
38
24
0
17 Jun 2024
MetaGPT: Merging Large Language Models Using Model Exclusive Task
  Arithmetic
MetaGPT: Merging Large Language Models Using Model Exclusive Task Arithmetic
Yuyan Zhou
Liang Song
Bingning Wang
Weipeng Chen
MoMe
30
16
0
17 Jun 2024
Twin-Merging: Dynamic Integration of Modular Expertise in Model Merging
Twin-Merging: Dynamic Integration of Modular Expertise in Model Merging
Zhenyi Lu
Chenghao Fan
Wei Wei
Xiaoye Qu
Dangyang Chen
Yu Cheng
MoMe
50
48
0
17 Jun 2024
Towards Efficient Pareto Set Approximation via Mixture of Experts Based
  Model Fusion
Towards Efficient Pareto Set Approximation via Mixture of Experts Based Model Fusion
Anke Tang
Li Shen
Yong Luo
Shiwei Liu
Han Hu
Bo Du
MoMe
31
6
0
14 Jun 2024
A Survey on Large Language Models from General Purpose to Medical
  Applications: Datasets, Methodologies, and Evaluations
A Survey on Large Language Models from General Purpose to Medical Applications: Datasets, Methodologies, and Evaluations
Jinqiang Wang
Huansheng Ning
Yi Peng
Qikai Wei
Daniel Tesfai
Wenwei Mao
Tao Zhu
Runhe Huang
LM&MA
AI4MH
ELM
44
5
0
14 Jun 2024
ME-Switch: A Memory-Efficient Expert Switching Framework for Large
  Language Models
ME-Switch: A Memory-Efficient Expert Switching Framework for Large Language Models
Jing Liu
Ruihao Gong
Mingyang Zhang
Yefei He
Jianfei Cai
Bohan Zhuang
MoE
45
0
0
13 Jun 2024
Delta-CoMe: Training-Free Delta-Compression with Mixed-Precision for
  Large Language Models
Delta-CoMe: Training-Free Delta-Compression with Mixed-Precision for Large Language Models
Bowen Ping
Shuo Wang
Hanqing Wang
Xu Han
Yuzhuang Xu
Yukun Yan
Yun Chen
Baobao Chang
Zhiyuan Liu
Maosong Sun
MQ
48
5
0
13 Jun 2024
Merging Improves Self-Critique Against Jailbreak Attacks
Merging Improves Self-Critique Against Jailbreak Attacks
Victor Gallego
AAML
MoMe
44
3
0
11 Jun 2024
MAP: Low-compute Model Merging with Amortized Pareto Fronts via Quadratic Approximation
MAP: Low-compute Model Merging with Amortized Pareto Fronts via Quadratic Approximation
Lu Li
Tianze Zhang
Zhiqi Bu
Suyuchen Wang
Huan He
Jie Fu
Yonghui Wu
Jiang Bian
Yong Chen
Yoshua Bengio
FedML
MoMe
100
3
0
11 Jun 2024
A Deep Dive into the Trade-Offs of Parameter-Efficient Preference
  Alignment Techniques
A Deep Dive into the Trade-Offs of Parameter-Efficient Preference Alignment Techniques
Megh Thakkar
Quentin Fournier
Matthew D Riemer
Pin-Yu Chen
Amal Zouaq
Payel Das
Sarath Chandar
ALM
42
8
0
07 Jun 2024
CorDA: Context-Oriented Decomposition Adaptation of Large Language Models for Task-Aware Parameter-Efficient Fine-tuning
CorDA: Context-Oriented Decomposition Adaptation of Large Language Models for Task-Aware Parameter-Efficient Fine-tuning
Yibo Yang
Xiaojie Li
Zhongzhu Zhou
Shuaiwen Leon Song
Jianlong Wu
Liqiang Nie
Guohao Li
45
6
0
07 Jun 2024
FusionBench: A Comprehensive Benchmark of Deep Model Fusion
FusionBench: A Comprehensive Benchmark of Deep Model Fusion
Anke Tang
Li Shen
Yong Luo
Han Hu
Bo Du
Dacheng Tao
ELM
MoMe
VLM
44
22
0
05 Jun 2024
HPE-CogVLM: New Head Pose Grounding Task Exploration on Vision Language
  Model
HPE-CogVLM: New Head Pose Grounding Task Exploration on Vision Language Model
Yu Tian
Tianqi Shao
Tsukasa Demizu
Xuyang Wu
Hsin-Tai Wu
26
3
0
04 Jun 2024
Pretrained Hybrids with MAD Skills
Pretrained Hybrids with MAD Skills
Nicholas Roberts
Samuel Guo
Zhiqi Gao
Satya Sai Srinath Namburi
Sonia Cromp
Chengjun Wu
Chengyu Duan
Frederic Sala
Mamba
42
0
0
02 Jun 2024
TAIA: Large Language Models are Out-of-Distribution Data Learners
TAIA: Large Language Models are Out-of-Distribution Data Learners
Shuyang Jiang
Yusheng Liao
Ya Zhang
Yu Wang
Yanfeng Wang
29
3
0
30 May 2024
Online Merging Optimizers for Boosting Rewards and Mitigating Tax in
  Alignment
Online Merging Optimizers for Boosting Rewards and Mitigating Tax in Alignment
Keming Lu
Bowen Yu
Fei Huang
Yang Fan
Runji Lin
Chang Zhou
MoMe
32
18
0
28 May 2024
FreezeAsGuard: Mitigating Illegal Adaptation of Diffusion Models via
  Selective Tensor Freezing
FreezeAsGuard: Mitigating Illegal Adaptation of Diffusion Models via Selective Tensor Freezing
Kai Huang
Wei Gao
42
2
0
24 May 2024
MiniCache: KV Cache Compression in Depth Dimension for Large Language
  Models
MiniCache: KV Cache Compression in Depth Dimension for Large Language Models
Akide Liu
Jing Liu
Zizheng Pan
Yefei He
Gholamreza Haffari
Bohan Zhuang
MQ
35
30
0
23 May 2024
EMR-Merging: Tuning-Free High-Performance Model Merging
EMR-Merging: Tuning-Free High-Performance Model Merging
Chenyu Huang
Peng Ye
Tao Chen
Tong He
Xiangyu Yue
Wanli Ouyang
MoMe
46
29
0
23 May 2024
Disperse-Then-Merge: Pushing the Limits of Instruction Tuning via
  Alignment Tax Reduction
Disperse-Then-Merge: Pushing the Limits of Instruction Tuning via Alignment Tax Reduction
Tingchen Fu
Deng Cai
Lemao Liu
Shuming Shi
Rui Yan
MoMe
58
13
0
22 May 2024
MeteoRA: Multiple-tasks Embedded LoRA for Large Language Models
MeteoRA: Multiple-tasks Embedded LoRA for Large Language Models
Jingwei Xu
Junyu Lai
Yunpeng Huang
MoE
MoMe
38
8
0
19 May 2024
A safety realignment framework via subspace-oriented model fusion for
  large language models
A safety realignment framework via subspace-oriented model fusion for large language models
Xin Yi
Shunfan Zheng
Linlin Wang
Xiaoling Wang
Liang He
60
21
0
15 May 2024
Aloe: A Family of Fine-tuned Open Healthcare LLMs
Aloe: A Family of Fine-tuned Open Healthcare LLMs
Ashwin Kumar Gururajan
Enrique Lopez-Cuena
Jordi Bayarri-Planas
Adrián Tormos
Daniel Hinjos
...
Lucia Urcelay-Ganzabal
Marta Gonzalez-Mallo
Sergio Alvarez-Napagao
Eduard Ayguadé-Parra
Ulises Cortés Dario Garcia-Gasulla
ELM
LM&MA
35
14
0
03 May 2024
Prometheus 2: An Open Source Language Model Specialized in Evaluating
  Other Language Models
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models
Seungone Kim
Juyoung Suk
Shayne Longpre
Bill Yuchen Lin
Jamin Shin
Sean Welleck
Graham Neubig
Moontae Lee
Kyungjae Lee
Minjoon Seo
MoMe
ALM
ELM
51
171
0
02 May 2024
HFT: Half Fine-Tuning for Large Language Models
HFT: Half Fine-Tuning for Large Language Models
Tingfeng Hui
Zhenyu Zhang
Shuohuan Wang
Weiran Xu
Yu Sun
Hua Wu
CLL
45
4
0
29 Apr 2024
A Survey on Self-Evolution of Large Language Models
A Survey on Self-Evolution of Large Language Models
Zhengwei Tao
Ting-En Lin
Xiancai Chen
Hangyu Li
Yuchuan Wu
Yongbin Li
Zhi Jin
Fei Huang
Dacheng Tao
Jingren Zhou
LRM
LM&Ro
57
22
0
22 Apr 2024
In-Context Learning State Vector with Inner and Momentum Optimization
In-Context Learning State Vector with Inner and Momentum Optimization
Dongfang Li
Zhenyu Liu
Xinshuo Hu
Zetian Sun
Baotian Hu
Min Zhang
40
5
0
17 Apr 2024
Balancing Speciality and Versatility: a Coarse to Fine Framework for
  Supervised Fine-tuning Large Language Model
Balancing Speciality and Versatility: a Coarse to Fine Framework for Supervised Fine-tuning Large Language Model
Hengyuan Zhang
Yanru Wu
Dawei Li
Zacc Yang
Rui Zhao
Yong Jiang
Fei Tan
ALM
35
1
0
16 Apr 2024
MedExpQA: Multilingual Benchmarking of Large Language Models for Medical
  Question Answering
MedExpQA: Multilingual Benchmarking of Large Language Models for Medical Question Answering
Inigo Alonso
Maite Oronoz
Rodrigo Agerri
AI4MH
LM&MA
ELM
52
16
1
08 Apr 2024
Have You Merged My Model? On The Robustness of Large Language Model IP
  Protection Methods Against Model Merging
Have You Merged My Model? On The Robustness of Large Language Model IP Protection Methods Against Model Merging
Tianshuo Cong
Delong Ran
Zesen Liu
Xinlei He
Jinyuan Liu
Yichen Gong
Qi Li
Anyu Wang
Xiaoyun Wang
MoMe
46
7
0
08 Apr 2024
Linear Combination of Saved Checkpoints Makes Consistency and Diffusion Models Better
Linear Combination of Saved Checkpoints Makes Consistency and Diffusion Models Better
En-hao Liu
Junyi Zhu
Zinan Lin
Xuefei Ning
Shuaiqi Wang
...
Sergey Yekhanin
Guohao Dai
Huazhong Yang
Yu-Xiang Wang
Yu Wang
MoMe
57
4
0
02 Apr 2024
Enhancing Content-based Recommendation via Large Language Model
Enhancing Content-based Recommendation via Large Language Model
Wentao Xu
Qianqian Xie
Shuo Yang
Jiangxia Cao
Shuchao Pang
27
5
0
30 Mar 2024
Checkpoint Merging via Bayesian Optimization in LLM Pretraining
Checkpoint Merging via Bayesian Optimization in LLM Pretraining
Deyuan Liu
Zecheng Wang
Bingning Wang
Weipeng Chen
Chunshan Li
Zhiying Tu
Dianhui Chu
Bo Li
Dianbo Sui
MoMe
52
15
0
28 Mar 2024
Arcee's MergeKit: A Toolkit for Merging Large Language Models
Arcee's MergeKit: A Toolkit for Merging Large Language Models
Charles Goddard
Shamane Siriwardhana
Malikeh Ehghaghi
Luke Meyers
Vladimir Karpukhin
Brian Benedict
Mark McQuade
Jacob Solawetz
MoMe
KELM
90
80
0
20 Mar 2024
Previous
12345
Next