Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2111.09832
Cited By
Merging Models with Fisher-Weighted Averaging
18 November 2021
Michael Matena
Colin Raffel
FedML
MoMe
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Merging Models with Fisher-Weighted Averaging"
50 / 283 papers shown
Title
Merging by Matching Models in Task Parameter Subspaces
Derek Tam
Mohit Bansal
Colin Raffel
MoMe
21
10
0
07 Dec 2023
Continuous 16-bit Training: Accelerating 32-bit Pre-Trained Neural Networks
Juyoung Yun
9
0
0
30 Nov 2023
Efficient Stitchable Task Adaptation
Haoyu He
Zizheng Pan
Jing Liu
Jianfei Cai
Bohan Zhuang
31
3
0
29 Nov 2023
LM-Cocktail: Resilient Tuning of Language Models via Model Merging
Shitao Xiao
Zheng Liu
Peitian Zhang
Xingrun Xing
MoMe
KELM
94
24
0
22 Nov 2023
ComPEFT: Compression for Communicating Parameter Efficient Updates via Sparsification and Quantization
Prateek Yadav
Leshem Choshen
Colin Raffel
Mohit Bansal
32
13
0
22 Nov 2023
Leveraging Function Space Aggregation for Federated Learning at Scale
Nikita Dhawan
Nicole Mitchell
Zachary B. Charles
Zachary Garrett
Gintare Karolina Dziugaite
FedML
24
3
0
17 Nov 2023
Language and Task Arithmetic with Parameter-Efficient Layers for Zero-Shot Summarization
Alexandra Chronopoulou
Jonas Pfeiffer
Joshua Maynez
Xinyi Wang
Sebastian Ruder
Priyanka Agrawal
MoMe
26
14
0
15 Nov 2023
Fuse to Forget: Bias Reduction and Selective Memorization through Model Fusion
Kerem Zaman
Leshem Choshen
Shashank Srivastava
KELM
MoMe
25
10
0
13 Nov 2023
L3 Ensembles: Lifelong Learning Approach for Ensemble of Foundational Language Models
Aidin Shiri
Kaushik Roy
Amit P. Sheth
Manas Gaur
KELM
30
4
0
11 Nov 2023
Cross-Silo Federated Learning Across Divergent Domains with Iterative Parameter Alignment
Matt Gorbett
Hossein Shirazi
Indrakshi Ray
FedML
36
2
0
08 Nov 2023
Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch
Le Yu
Yu Bowen
Haiyang Yu
Fei Huang
Yongbin Li
MoMe
28
272
0
06 Nov 2023
torchdistill Meets Hugging Face Libraries for Reproducible, Coding-Free Deep Learning Studies: A Case Study on NLP
Yoshitomo Matsubara
VLM
26
1
0
26 Oct 2023
Improving Language Models Meaning Understanding and Consistency by Learning Conceptual Roles from Dictionary
Myeongjun Jang
Thomas Lukasiewicz
27
4
0
24 Oct 2023
SAM-CLIP: Merging Vision Foundation Models towards Semantic and Spatial Understanding
Haoxiang Wang
Pavan Kumar Anasosalu Vasu
Fartash Faghri
Raviteja Vemulapalli
Mehrdad Farajtabar
Sachin Mehta
Mohammad Rastegari
Oncel Tuzel
Hadi Pouransari
VLM
32
67
0
23 Oct 2023
Model Merging by Uncertainty-Based Gradient Matching
Nico Daheim
Thomas Möllenhoff
E. Ponti
Iryna Gurevych
Mohammad Emtiyaz Khan
MoMe
FedML
32
43
0
19 Oct 2023
Transformer Fusion with Optimal Transport
Moritz Imfeld
Jacopo Graldi
Marco Giordano
Thomas Hofmann
Sotiris Anagnostidis
Sidak Pal Singh
ViT
MoMe
29
16
0
09 Oct 2023
Chat Vector: A Simple Approach to Equip LLMs with Instruction Following and Model Alignment in New Languages
Shih-Cheng Huang
Pin-Zu Li
Yu-Chi Hsu
Kuang-Ming Chen
Yu Tung Lin
Shih-Kai Hsiao
Richard Tzong-Han Tsai
Hung-yi Lee
MoMe
34
13
0
07 Oct 2023
Parameter Efficient Multi-task Model Fusion with Partial Linearization
Anke Tang
Li Shen
Yong Luo
Yibing Zhan
Han Hu
Bo Du
Yixin Chen
Dacheng Tao
MoMe
26
30
0
07 Oct 2023
AdaMerging: Adaptive Model Merging for Multi-Task Learning
Enneng Yang
Zhenyi Wang
Li Shen
Shiwei Liu
Guibing Guo
Xingwei Wang
Dacheng Tao
MoMe
35
97
0
04 Oct 2023
BYOM: Building Your Own Multi-Task Model For Free
Weisen Jiang
Baijiong Lin
Han Shi
Yu Zhang
Zhenguo Li
James T. Kwok
MoMe
37
5
0
03 Oct 2023
Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy
Pingzhi Li
Zhenyu (Allen) Zhang
Prateek Yadav
Yi-Lin Sung
Yu Cheng
Mohit Bansal
Tianlong Chen
MoMe
26
33
0
02 Oct 2023
FedLPA: One-shot Federated Learning with Layer-Wise Posterior Aggregation
Xiang Liu
Liangxi Liu
Feiyang Ye
Yunheng Shen
Xia Li
Linshan Jiang
Jialin Li
30
4
0
30 Sep 2023
Deep Model Fusion: A Survey
Weishi Li
Yong Peng
Miao Zhang
Liang Ding
Han Hu
Li Shen
FedML
MoMe
33
52
0
27 Sep 2023
Jointly Training Large Autoregressive Multimodal Models
Emanuele Aiello
L. Yu
Yixin Nie
Armen Aghajanyan
Barlas Oğuz
19
29
0
27 Sep 2023
Do the Frankenstein, or how to achieve better out-of-distribution performance with manifold mixing model soup
Hannes Fassold
MoMe
UQCV
26
2
0
28 Aug 2023
Mode Combinability: Exploring Convex Combinations of Permutation Aligned Models
Adrián Csiszárik
M. Kiss
Péter Korösi-Szabó
Márton Muntag
Gergely Papp
D. Varga
MoMe
24
1
0
22 Aug 2023
ZhiJian: A Unifying and Rapidly Deployable Toolbox for Pre-trained Model Reuse
Yi-Kai Zhang
Lu Ren
Chao Yi
Qiwen Wang
De-Chuan Zhan
Han-Jia Ye
23
2
0
17 Aug 2023
Separate the Wheat from the Chaff: Model Deficiency Unlearning via Parameter-Efficient Module Operation
Xinshuo Hu
Dongfang Li
Baotian Hu
Zihao Zheng
Zhenyu Liu
M. Zhang
KELM
MU
33
26
0
16 Aug 2023
UnIVAL: Unified Model for Image, Video, Audio and Language Tasks
Mustafa Shukor
Corentin Dancette
Alexandre Ramé
Matthieu Cord
MoMe
MLLM
61
42
0
30 Jul 2023
LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition
Chengsong Huang
Qian Liu
Bill Yuchen Lin
Tianyu Pang
Chao Du
Min-Bin Lin
MoMe
38
185
0
25 Jul 2023
Tangent Model Composition for Ensembling and Continual Fine-tuning
Tianlin Liu
Stefano Soatto
LRM
MoMe
CLL
27
15
0
16 Jul 2023
Sparse Model Soups: A Recipe for Improved Pruning via Model Averaging
Max Zimmer
Christoph Spiegel
Sebastian Pokutta
MoMe
41
14
0
29 Jun 2023
Composing Parameter-Efficient Modules with Arithmetic Operations
Jinghan Zhang
Shiqi Chen
Junteng Liu
Junxian He
KELM
MoMe
26
109
0
26 Jun 2023
Instant Soup: Cheap Pruning Ensembles in A Single Pass Can Draw Lottery Tickets from Large Models
A. Jaiswal
Shiwei Liu
Tianlong Chen
Ying Ding
Zhangyang Wang
VLM
32
22
0
18 Jun 2023
Git-Theta: A Git Extension for Collaborative Development of Machine Learning Models
Nikhil Kandpal
Brian Lester
Mohammed Muqeeth
Anisha Mascarenhas
Monty Evans
Vishal Baskaran
Tenghao Huang
Haokun Liu
Colin Raffel
VLM
16
10
0
07 Jun 2023
Rewarded soups: towards Pareto-optimal alignment by interpolating weights fine-tuned on diverse rewards
Alexandre Ramé
Guillaume Couairon
Mustafa Shukor
Corentin Dancette
Jean-Baptiste Gaya
Laure Soulier
Matthieu Cord
MoMe
35
136
0
07 Jun 2023
Soft Merging of Experts with Adaptive Routing
Mohammed Muqeeth
Haokun Liu
Colin Raffel
MoMe
MoE
34
45
0
06 Jun 2023
TIES-Merging: Resolving Interference When Merging Models
Prateek Yadav
Derek Tam
Leshem Choshen
Colin Raffel
Joey Tianyi Zhou
MoMe
42
253
0
02 Jun 2023
Efficient Storage of Fine-Tuned Models via Low-Rank Approximation of Weight Residuals
Simo Ryu
S. Seo
Jaejun Yoo
29
5
0
28 May 2023
Emergent Modularity in Pre-trained Transformers
Zhengyan Zhang
Zhiyuan Zeng
Yankai Lin
Chaojun Xiao
Xiaozhi Wang
Xu Han
Zhiyuan Liu
Ruobing Xie
Maosong Sun
Jie Zhou
MoE
47
23
0
28 May 2023
Free Lunch: Robust Cross-Lingual Transfer via Model Checkpoint Averaging
Fabian David Schmidt
Ivan Vulić
Goran Glavavs
24
8
0
26 May 2023
Domain Aligned Prefix Averaging for Domain Generalization in Abstractive Summarization
Pranav Ajit Nair
Sukomal Pal
Pradeepika Verm
MoMe
34
2
0
26 May 2023
Transferring Learning Trajectories of Neural Networks
Daiki Chijiwa
28
2
0
23 May 2023
Task Arithmetic in the Tangent Space: Improved Editing of Pre-Trained Models
Guillermo Ortiz-Jiménez
Alessandro Favero
P. Frossard
MoMe
51
110
0
22 May 2023
Knowledge Card: Filling LLMs' Knowledge Gaps with Plug-in Specialized Language Models
Shangbin Feng
Weijia Shi
Yuyang Bai
Vidhisha Balachandran
Tianxing He
Yulia Tsvetkov
KELM
50
28
0
17 May 2023
Simplifying Distributed Neural Network Training on Massive Graphs: Randomized Partitions Improve Model Aggregation
Jiong Zhu
Aishwarya N. Reganti
E-Wen Huang
Charles Dickens
Nikhil S. Rao
Karthik Subbian
Danai Koutra
GNN
FedML
37
3
0
17 May 2023
ZipIt! Merging Models from Different Tasks without Training
George Stoica
Daniel Bolya
J. Bjorner
Pratik Ramesh
Taylor N. Hearn
Judy Hoffman
VLM
MoMe
46
111
0
04 May 2023
An Empirical Study of Multimodal Model Merging
Yi-Lin Sung
Linjie Li
Kevin Qinghong Lin
Zhe Gan
Joey Tianyi Zhou
Lijuan Wang
MoMe
20
40
0
28 Apr 2023
PopulAtion Parameter Averaging (PAPA)
Alexia Jolicoeur-Martineau
Emy Gervais
Kilian Fatras
Yan Zhang
Simon Lacoste-Julien
MoMe
40
17
0
06 Apr 2023
UKP-SQuARE v3: A Platform for Multi-Agent QA Research
Haritz Puerto
Tim Baumgärtner
Rachneet Sachdeva
Haishuo Fang
Haotian Zhang
Sewin Tariverdian
Kexin Wang
Iryna Gurevych
26
2
0
31 Mar 2023
Previous
1
2
3
4
5
6
Next