Merging Multi-Task Models via Weight-Ensembling Mixture of Experts

1 February 2024
Anke Tang
Li Shen
Yong Luo
Nan Yin
Lefei Zhang
Dacheng Tao
    MoMe
arXiv: 2402.00433

Papers citing "Merging Multi-Task Models via Weight-Ensembling Mixture of Experts"

36 / 36 papers shown
CAT Merging: A Training-Free Approach for Resolving Conflicts in Model Merging
Wenju Sun
Qingyong Li
Yangli-ao Geng
Boyang Li
MoMe
34
0
0
11 May 2025
Adaptive Helpfulness-Harmlessness Alignment with Preference Vectors
Ren-Wei Liang
Chin-Ting Hsu
Chan-Hung Yu
Saransh Agrawal
Shih-Cheng Huang
Shang-Tse Chen
Kuan-Hao Huang
Shao-Hua Sun
81
0
0
27 Apr 2025
Leveraging Submodule Linearity Enhances Task Arithmetic Performance in LLMs
Rui Dai
Sile Hu
Xu Shen
Yonggang Zhang
Xinmei Tian
Jieping Ye
MoMe
49
2
0
15 Apr 2025
MASS: MoErging through Adaptive Subspace Selection
Donato Crisostomi
Alessandro Zirilli
Antonio Andrea Gargiulo
Maria Sofia Bucarelli
Simone Scardapane
Fabrizio Silvestri
Iacopo Masi
Emanuele Rodolà
MoMe
40
0
0
06 Apr 2025
AdaRank: Adaptive Rank Pruning for Enhanced Model Merging
Chanhyuk Lee
Jiho Choi
Chanryeol Lee
Donggyun Kim
Seunghoon Hong
MoMe
52
0
0
28 Mar 2025
LeForecast: Enterprise Hybrid Forecast by Time Series Intelligence
Zheng Tan
Yiwen Nie
Wenfa Wu
Guanyu Zhang
Yanze Liu
...
Chao Yang
Jiaxuan Fan
Yuan He
Hongsheng Qi
Yangzhou Du
AI4TS
42
0
0
27 Mar 2025
Task Vector Quantization for Memory-Efficient Model Merging
Youngeun Kim
Seunghwan Lee
Aecheon Jung
Bogon Ryu
Sungeun Hong
MQ
MoMe
54
0
0
10 Mar 2025
Seeing Delta Parameters as JPEG Images: Data-Free Delta Compression with Discrete Cosine Transform
Chenyu Huang
Peng Ye
Xinyu Wang
Shenghe Zheng
Biqing Qi
Lei Bai
Wanli Ouyang
Tao Chen
31
0
0
09 Mar 2025
Multi-Level Collaboration in Model Merging
Qi Li
Runpeng Yu
Xinchao Wang
MoMe
FedML
93
0
0
03 Mar 2025
Parameter Efficient Merging for Multimodal Large Language Models with Complementary Parameter Adaptation
Fanhu Zeng
Haiyang Guo
Fei Zhu
Li Shen
Hao Tang
MoMe
54
1
0
24 Feb 2025
1bit-Merging: Dynamic Quantized Merging for Large Language Models
Shuqi Liu
Han Wu
Bowei He
Zehua Liu
Xiongwei Han
M. Yuan
Linqi Song
MoMe
MQ
66
1
0
15 Feb 2025
Bone Soups: A Seek-and-Soup Model Merging Approach for Controllable Multi-Objective Generation
Guofu Xie
Xiao Zhang
Ting Yao
Yunsheng Shi
MoMe
60
1
0
15 Feb 2025
Task Arithmetic in Trust Region: A Training-Free Model Merging Approach to Navigate Knowledge Conflicts
Wenju Sun
Qingyong Li
Wen Wang
Yangli-ao Geng
Boyang Li
41
2
0
28 Jan 2025
Localize-and-Stitch: Efficient Model Merging via Sparse Task Arithmetic
Yifei He
Yuzheng Hu
Yong Lin
Tong Zhang
Han Zhao
FedML
MoMe
65
18
0
08 Jan 2025
Collective Model Intelligence Requires Compatible Specialization
Jyothish Pari
Samy Jelassi
Pulkit Agrawal
MoMe
51
1
0
04 Nov 2024
Efficient and Effective Weight-Ensembling Mixture of Experts for Multi-Task Model Merging
Li Shen
Anke Tang
Enneng Yang
G. Guo
Yong Luo
Lefei Zhang
Xiaochun Cao
Bo Du
Dacheng Tao
MoMe
32
5
0
29 Oct 2024
SurgeryV2: Bridging the Gap Between Model Merging and Multi-Task Learning with Deep Representation Surgery
Enneng Yang
Li Shen
Zhenyi Wang
G. Guo
Xingwei Wang
Xiaochun Cao
Jie Zhang
Dacheng Tao
MoMe
37
4
0
18 Oct 2024
Model Swarms: Collaborative Search to Adapt LLM Experts via Swarm Intelligence
Shangbin Feng
Zifeng Wang
Yike Wang
Sayna Ebrahimi
Hamid Palangi
...
Nathalie Rauschmayr
Yejin Choi
Yulia Tsvetkov
Chen-Yu Lee
Tomas Pfister
MoMe
35
3
0
15 Oct 2024
Glider: Global and Local Instruction-Driven Expert Router
Pingzhi Li
Prateek Yadav
Jaehong Yoon
Jie Peng
Yi-Lin Sung
Joey Tianyi Zhou
Tianlong Chen
MoMe
MoE
30
1
0
09 Oct 2024
MECFormer: Multi-task Whole Slide Image Classification with Expert Consultation Network
Doanh C. Bui
Jin Tae Kwak
DiffM
MedIm
23
0
0
06 Oct 2024
DaWin: Training-free Dynamic Weight Interpolation for Robust Adaptation
Changdae Oh
Yixuan Li
Kyungwoo Song
Sangdoo Yun
Dongyoon Han
OOD
MoMe
45
4
0
03 Oct 2024
Realistic Evaluation of Model Merging for Compositional Generalization
Derek Tam
Yash Kant
Brian Lester
Igor Gilitschenski
Colin Raffel
MoMe
35
6
0
26 Sep 2024
Entity-Aware Self-Attention and Contextualized GCN for Enhanced Relation Extraction in Long Sentences
Xin Wang
Xinyi Bai
25
0
0
15 Sep 2024
SQL-GEN: Bridging the Dialect Gap for Text-to-SQL Via Synthetic Data And Model Merging
Mohammadreza Pourreza
Ruoxi Sun
Hailong Li
Lesly Miculicich
Tomas Pfister
Sercan Ö. Arik
MoMe
34
5
0
22 Aug 2024
SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models
Anke Tang
Li Shen
Yong Luo
Shuai Xie
Han Hu
Lefei Zhang
Bo Du
Dacheng Tao
MoMe
42
4
0
19 Aug 2024
A Survey on Model MoErging: Recycling and Routing Among Specialized Experts for Collaborative Learning
Prateek Yadav
Colin Raffel
Mohammed Muqeeth
Lucas Caccia
Haokun Liu
Tianlong Chen
Joey Tianyi Zhou
Leshem Choshen
Alessandro Sordoni
MoMe
46
21
0
13 Aug 2024
Retrieval-Augmented Mixture of LoRA Experts for Uploadable Machine Learning
Ziyu Zhao
Leilei Gan
Guoyin Wang
Yuwei Hu
Tao Shen
Hongxia Yang
Kun Kuang
Fei Wu
MoE
MoMe
39
11
0
24 Jun 2024
Twin-Merging: Dynamic Integration of Modular Expertise in Model Merging
Zhenyi Lu
Chenghao Fan
Wei Wei
Xiaoye Qu
Dangyang Chen
Yu Cheng
MoMe
47
48
0
17 Jun 2024
Towards Efficient Pareto Set Approximation via Mixture of Experts Based Model Fusion
Anke Tang
Li Shen
Yong Luo
Shiwei Liu
Han Hu
Bo Du
MoMe
31
6
0
14 Jun 2024
FusionBench: A Comprehensive Benchmark of Deep Model Fusion
Anke Tang
Li Shen
Yong Luo
Han Hu
Bo Du
Dacheng Tao
ELM
MoMe
VLM
44
22
0
05 Jun 2024
DAM: Dynamic Adapter Merging for Continual Video QA Learning
Feng Cheng
Ziyang Wang
Yi-Lin Sung
Yan-Bo Lin
Mohit Bansal
Gedas Bertasius
CLL
MoMe
31
10
0
13 Mar 2024
π-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation
Chengyue Wu
Teng Wang
Yixiao Ge
Zeyu Lu
Rui-Zhi Zhou
Ying Shan
Ping Luo
MoMe
82
35
0
27 Apr 2023
Git Re-Basin: Merging Models modulo Permutation Symmetries
Samuel K. Ainsworth
J. Hayase
S. Srinivasa
MoMe
255
314
0
11 Sep 2022
Mixture-of-Experts with Expert Choice Routing
Yan-Quan Zhou
Tao Lei
Han-Chu Liu
Nan Du
Yanping Huang
Vincent Zhao
Andrew M. Dai
Zhifeng Chen
Quoc V. Le
James Laudon
MoE
160
327
0
18 Feb 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
305
7,443
0
11 Nov 2021
Optimizing Mode Connectivity via Neuron Alignment
N. Joseph Tatro
Pin-Yu Chen
Payel Das
Igor Melnyk
P. Sattigeri
Rongjie Lai
MoMe
223
80
0
05 Sep 2020