ResearchTrend.AI
arXiv: 2111.09832
Merging Models with Fisher-Weighted Averaging


18 November 2021
Michael Matena
Colin Raffel
    FedML
    MoMe

Papers citing "Merging Models with Fisher-Weighted Averaging"

50 / 283 papers shown
Layer Swapping for Zero-Shot Cross-Lingual Transfer in Large Language Models
Lucas Bandarkar
Benjamin Muller
Pritish Yuvraj
Rui Hou
Nayan Singhal
Hongjiang Lv
Bing-Quan Liu
KELM
LRM
MoMe
52
3
0
02 Oct 2024
Mitigating Training Imbalance in LLM Fine-Tuning via Selective Parameter Merging
Yiming Ju
Ziyi Ni
Xingrun Xing
Zhixiong Zeng
Hanyu Zhao
Siqi Fan
Zheng Zhang
MoMe
37
2
0
01 Oct 2024
Dual Consolidation for Pre-Trained Model-Based Domain-Incremental Learning
Da-Wei Zhou
Zi-Wen Cai
Han-Jia Ye
Lijun Zhang
De-Chuan Zhan
CLL
AI4CE
76
2
0
01 Oct 2024
RouterDC: Query-Based Router by Dual Contrastive Learning for Assembling Large Language Models
Shuhao Chen
Weisen Jiang
Baijiong Lin
James T. Kwok
Yu Zhang
RALM
MQ
40
5
0
30 Sep 2024
HM3: Hierarchical Multi-Objective Model Merging for Pretrained Models
Yu Zhou
Xingyu Wu
Jibin Wu
Liang Feng
Kay Chen Tan
MoMe
61
0
0
27 Sep 2024
Realistic Evaluation of Model Merging for Compositional Generalization
Derek Tam
Yash Kant
Brian Lester
Igor Gilitschenski
Colin Raffel
MoMe
35
6
0
26 Sep 2024
Merging LoRAs like Playing LEGO: Pushing the Modularity of LoRA to Extremes Through Rank-Wise Clustering
Ziyu Zhao
Tao Shen
Didi Zhu
Zexi Li
Jing Su
Xuwu Wang
Kun Kuang
Fei Wu
MoMe
36
6
0
24 Sep 2024
Layer-wise Model Merging for Unsupervised Domain Adaptation in Segmentation Tasks
Roberto Alcover-Couso
Juan C. Sanmiguel
Marcos Escudero-Viñolo
Jose M. Martínez
FedML
MoMe
28
1
0
24 Sep 2024
Towards understanding evolution of science through language model series
Junjie Dong
Zhuoqi Lyu
Qing Ke
AI4TS
35
0
0
15 Sep 2024
Erasure Coded Neural Network Inference via Fisher Averaging
Divyansh Jhunjhunwala
Neharika Jali
Gauri Joshi
Shiqiang Wang
MoMe
FedML
31
1
0
02 Sep 2024
Improving the Classification Effect of Clinical Images of Diseases for Multi-Source Privacy Protection
Tian Bowen
Xu Zhengyang
Yin Zhihao
Wang Jingying
Yue Yutao
FedML
42
0
0
23 Aug 2024
SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models
Anke Tang
Li Shen
Yong Luo
Shuai Xie
Han Hu
Lefei Zhang
Bo Du
Dacheng Tao
MoMe
42
4
0
19 Aug 2024
MergeRepair: An Exploratory Study on Merging Task-Specific Adapters in Code LLMs for Automated Program Repair
Meghdad Dehghan
Jie JW Wu
Fatemeh H. Fard
Ali Ouni
MoMe
50
2
0
18 Aug 2024
Activated Parameter Locating via Causal Intervention for Model Merging
Fanshuang Kong
Richong Zhang
Ziqiao Wang
MoMe
16
1
0
18 Aug 2024
FuseChat: Knowledge Fusion of Chat Models
Fanqi Wan
Longguang Zhong
Ziyi Yang
Ruijun Chen
Xiaojun Quan
ALM
KELM
MoMe
32
23
0
15 Aug 2024
UNIC: Universal Classification Models via Multi-teacher Distillation
Mert Bulent Sariyildiz
Philippe Weinzaepfel
Thomas Lucas
Diane Larlus
Yannis Kalantidis
34
6
0
09 Aug 2024
ProFuser: Progressive Fusion of Large Language Models
Tianyuan Shi
Fanqi Wan
Canbin Huang
Xiaojun Quan
Chenliang Li
Ming Yan
Ji Zhang
MoMe
30
2
0
09 Aug 2024
Extend Model Merging from Fine-Tuned to Pre-Trained Large Language Models via Weight Disentanglement
Le Yu
Bowen Yu
Haiyang Yu
Fei Huang
Yongbin Li
MoMe
32
5
0
06 Aug 2024
Task Prompt Vectors: Effective Initialization through Multi-Task Soft-Prompt Transfer
Wei Chen
Long Chen
Ivan Srba
Yu Wu
MoMe
VLM
36
3
0
02 Aug 2024
Machine Unlearning in Generative AI: A Survey
Zheyuan Liu
Guangyao Dou
Zhaoxuan Tan
Yijun Tian
Meng Jiang
MU
31
14
0
30 Jul 2024
Computer Audition: From Task-Specific Machine Learning to Foundation Models
Andreas Triantafyllopoulos
Iosif Tsangko
Alexander Gebhard
A. Mesaros
Tuomas Virtanen
Björn Schuller
45
4
0
22 Jul 2024
Recent Advances in Generative AI and Large Language Models: Current Status, Challenges, and Perspectives
D. Hagos
Rick Battle
Danda B. Rawat
LM&MA
OffRL
31
22
0
20 Jul 2024
Mitigating Catastrophic Forgetting in Language Transfer via Model Merging
Anton Alexandrov
Veselin Raychev
Mark Niklas Muller
Ce Zhang
Martin Vechev
Kristina Toutanova
MoMe
CLL
KELM
42
13
0
11 Jul 2024
Foundation Model Engineering: Engineering Foundation Models Just as Engineering Software
Dezhi Ran
Mengzhou Wu
Wei Yang
Tao Xie
AI4CE
36
1
0
11 Jul 2024
MagMax: Leveraging Model Merging for Seamless Continual Learning
Daniel Marczak
Bartłomiej Twardowski
Tomasz Trzciński
Sebastian Cygert
MoMe
CLL
53
18
0
08 Jul 2024
Merge, Ensemble, and Cooperate! A Survey on Collaborative Strategies in the Era of Large Language Models
Jinliang Lu
Ziliang Pang
Min Xiao
Yaochen Zhu
Rui Xia
Jiajun Zhang
MoMe
49
18
0
08 Jul 2024
Unlocking the Potential of Model Merging for Low-Resource Languages
Mingxu Tao
Chen Zhang
Quzhe Huang
Tianyao Ma
Songfang Huang
Dongyan Zhao
Yansong Feng
CLL
MoMe
27
3
0
04 Jul 2024
PLeaS -- Merging Models with Permutations and Least Squares
Anshul Nasery
J. Hayase
Pang Wei Koh
Sewoong Oh
MoMe
51
3
0
02 Jul 2024
DogeRM: Equipping Reward Models with Domain Knowledge through Model Merging
Tzu-Han Lin
Chen An Li
Hung-yi Lee
Yun-Nung Chen
VLM
ALM
26
4
0
01 Jul 2024
It's Morphing Time: Unleashing the Potential of Multiple LLMs via Multi-objective Optimization
Bingdong Li
Zixiang Di
Yanting Yang
Hong Qian
Peng Yang
Hao Hao
Ke Tang
Aimin Zhou
MoMe
19
5
0
29 Jun 2024
Enhancing Accuracy and Parameter-Efficiency of Neural Representations for Network Parameterization
Hongjun Choi
Jayaraman J. Thiagarajan
Ruben Glatt
Shusen Liu
43
0
0
29 Jun 2024
Sequential Editing for Lifelong Training of Speech Recognition Models
Devang Kulshreshtha
Saket Dingliwal
Brady C. Houston
Nikolaos Pappas
S. Ronanki
KELM
CLL
29
1
0
25 Jun 2024
Lottery Ticket Adaptation: Mitigating Destructive Interference in LLMs
Ashwinee Panda
Berivan Isik
Xiangyu Qi
Sanmi Koyejo
Tsachy Weissman
Prateek Mittal
MoMe
45
13
0
24 Jun 2024
DEM: Distribution Edited Model for Training with Mixed Data Distributions
Dhananjay Ram
Aditya Rawal
Momchil Hardalov
Nikolaos Pappas
Sheng Zha
MoMe
34
1
0
21 Jun 2024
MOS: Model Synergy for Test-Time Adaptation on LiDAR-Based 3D Object Detection
Zhuoxiao Chen
Junjie Meng
Mahsa Baktashmotlagh
Yonggang Zhang
Zi Huang
Yadan Luo
80
1
0
21 Jun 2024
Model Merging and Safety Alignment: One Bad Model Spoils the Bunch
Hasan Hammoud
Umberto Michieli
Fabio Pizzati
Philip H. S. Torr
Adel Bibi
Guohao Li
Mete Ozay
MoMe
31
15
0
20 Jun 2024
Knowledge Fusion By Evolving Weights of Language Models
Guodong Du
Jing Li
Hanting Liu
Runhua Jiang
Shuyang Yu
Yifei Guo
S. Goh
Ho-Kin Tang
MoMe
44
8
0
18 Jun 2024
Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts
Junmo Kang
Leonid Karlinsky
Hongyin Luo
Zhen Wang
Jacob A. Hansen
James Glass
David D. Cox
Rameswar Panda
Rogerio Feris
Alan Ritter
MoMe
MoE
36
8
0
17 Jun 2024
Safety Arithmetic: A Framework for Test-time Safety Alignment of Language Models by Steering Parameters and Activations
Rima Hazra
Sayan Layek
Somnath Banerjee
Soujanya Poria
KELM
LLMSV
34
6
0
17 Jun 2024
Split, Unlearn, Merge: Leveraging Data Attributes for More Effective Unlearning in LLMs
S. Kadhe
Farhan Ahmed
Dennis Wei
Nathalie Baracaldo
Inkit Padhi
MoMe
MU
28
7
0
17 Jun 2024
DELLA-Merging: Reducing Interference in Model Merging through Magnitude-Based Sampling
Pala Tej Deep
Rishabh Bhardwaj
Soujanya Poria
MoMe
30
24
0
17 Jun 2024
MetaGPT: Merging Large Language Models Using Model Exclusive Task Arithmetic
Yuyan Zhou
Liang Song
Bingning Wang
Weipeng Chen
MoMe
30
16
0
17 Jun 2024
On Giant's Shoulders: Effortless Weak to Strong by Dynamic Logits Fusion
Chenghao Fan
Zhenyi Lu
Wei Wei
Jie Tian
Xiaoye Qu
Dangyang Chen
Yu Cheng
MoMe
48
5
0
17 Jun 2024
Twin-Merging: Dynamic Integration of Modular Expertise in Model Merging
Zhenyi Lu
Chenghao Fan
Wei Wei
Xiaoye Qu
Dangyang Chen
Yu Cheng
MoMe
47
48
0
17 Jun 2024
Compress then Serve: Serving Thousands of LoRA Adapters with Little Overhead
Rickard Brüel-Gabrielsson
Jiacheng Zhu
Onkar Bhardwaj
Leshem Choshen
Kristjan Greenewald
Mikhail Yurochkin
Justin Solomon
45
5
0
17 Jun 2024
Towards Efficient Pareto Set Approximation via Mixture of Experts Based Model Fusion
Anke Tang
Li Shen
Yong Luo
Shiwei Liu
Han Hu
Bo Du
MoMe
31
6
0
14 Jun 2024
Diffusion Soup: Model Merging for Text-to-Image Diffusion Models
Benjamin Biggs
Arjun Seshadri
Yang Zou
Achin Jain
Aditya Golatkar
Yusheng Xie
Alessandro Achille
Ashwin Swaminathan
Stefano Soatto
MoMe
DiffM
40
10
0
12 Jun 2024
MAP: Low-compute Model Merging with Amortized Pareto Fronts via Quadratic Approximation
Lu Li
T. Zhang
Zhiqi Bu
Suyuchen Wang
Huan He
Jie Fu
Yonghui Wu
Jiang Bian
Yong Chen
Yoshua Bengio
FedML
MoMe
100
3
0
11 Jun 2024
FusionBench: A Comprehensive Benchmark of Deep Model Fusion
Anke Tang
Li Shen
Yong Luo
Han Hu
Bo Du
Dacheng Tao
ELM
MoMe
VLM
44
22
0
05 Jun 2024
Fusion-PSRO: Nash Policy Fusion for Policy Space Response Oracles
Jiesong Lian
Yucong Huang
Chengdong Ma
Mingzhi Wang
Ying Wen
Long Hu
Yixue Hao
59
0
0
31 May 2024