2403.09891
Fisher Mask Nodes for Language Model Merging
14 March 2024
Thennal D K
Ganesh Nathan
Suchithra M S
MoMe
AI4CE
Papers citing "Fisher Mask Nodes for Language Model Merging"
12 / 12 papers shown
Title
RouterEval: A Comprehensive Benchmark for Routing LLMs to Explore Model-level Scaling Up in LLMs
Zhongzhan Huang
Guoming Ling
Vincent S. Liang
Yupei Lin
Yandong Chen
Shanshan Zhong
Hefeng Wu
LRM
120
5
0
08 Mar 2025
Mitigating the Backdoor Effect for Multi-Task Model Merging via Safety-Aware Subspace
Jinluan Yang
Anke Tang
Didi Zhu
Zhengyu Chen
Li Shen
Leilei Gan
MoMe
AAML
112
4
0
17 Oct 2024
Git Re-Basin: Merging Models modulo Permutation Symmetries
Samuel K. Ainsworth
J. Hayase
S. Srinivasa
MoMe
268
326
0
11 Sep 2022
Fusing finetuned models for better pretraining
Leshem Choshen
Elad Venezian
Noam Slonim
Yoav Katz
FedML
AI4CE
MoMe
97
93
0
06 Apr 2022
The Role of Permutation Invariance in Linear Mode Connectivity of Neural Networks
R. Entezari
Hanie Sedghi
O. Saukh
Behnam Neyshabur
MoMe
65
226
0
12 Oct 2021
Pre-Trained Models: Past, Present and Future
Xu Han
Zhengyan Zhang
Ning Ding
Yuxian Gu
Xiao Liu
...
Jie Tang
Ji-Rong Wen
Jinhui Yuan
Wayne Xin Zhao
Jun Zhu
AIFin
MQ
AI4MH
117
836
0
14 Jun 2021
Measuring Data Leakage in Machine-Learning Models with Fisher Information
Awni Y. Hannun
Chuan Guo
Laurens van der Maaten
FedML
MIACV
27
55
0
23 Feb 2021
SWAD: Domain Generalization by Seeking Flat Minima
Junbum Cha
Sanghyuk Chun
Kyungjae Lee
Han-Cheol Cho
Seunghyun Park
Yunsung Lee
Sungrae Park
MoMe
267
441
0
17 Feb 2021
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
AIMat
419
24,160
0
26 Jul 2019
A Survey on Multi-Task Learning
Yu Zhang
Qiang Yang
AIMat
373
2,196
0
25 Jul 2017
Overcoming catastrophic forgetting in neural networks
J. Kirkpatrick
Razvan Pascanu
Neil C. Rabinowitz
J. Veness
Guillaume Desjardins
...
A. Grabska-Barwinska
Demis Hassabis
Claudia Clopath
D. Kumaran
R. Hadsell
CLL
283
7,410
0
02 Dec 2016
Communication-Efficient Learning of Deep Networks from Decentralized Data
H. B. McMahan
Eider Moore
Daniel Ramage
S. Hampson
Blaise Agüera y Arcas
FedML
239
17,328
0
17 Feb 2016