ResearchTrend.AI
Model Merging by Uncertainty-Based Gradient Matching

19 October 2023
Nico Daheim
Thomas Möllenhoff
Edoardo Ponti
Iryna Gurevych
Mohammad Emtiyaz Khan
MoMe, FedML

Papers citing "Model Merging by Uncertainty-Based Gradient Matching"

32 / 32 papers shown
RouterEval: A Comprehensive Benchmark for Routing LLMs to Explore Model-level Scaling Up in LLMs
Zhongzhan Huang
Guoming Ling
Vincent S. Liang
Yupei Lin
Yandong Chen
Shanshan Zhong
Hefeng Wu
LRM
198
7
0
08 Mar 2025
SplatPose: Geometry-Aware 6-DoF Pose Estimation from Single RGB Image via 3D Gaussian Splatting
Linqi Yang
Xiongwei Zhao
Qihao Sun
Ke Wang
Ao Chen
Peng Kang
3DGS
134
6
0
07 Mar 2025
GNNMerge: Merging of GNN Models Without Accessing Training Data
Vipul Garg
Ishita Thakre
Sayan Ranu
MoMe
176
0
0
05 Mar 2025
Beyond the Permutation Symmetry of Transformers: The Role of Rotation for Model Fusion
Binchi Zhang
Zaiyi Zheng
Zhengzhang Chen
Wenlin Yao
200
1
0
01 Feb 2025
Evolutionary Optimization of Model Merging Recipes
Takuya Akiba
Makoto Shing
Yujin Tang
Qi Sun
David Ha
MoMe
291
125
0
28 Jan 2025
Task Singular Vectors: Reducing Task Interference in Model Merging
Antonio Andrea Gargiulo
Donato Crisostomi
Maria Sofia Bucarelli
Simone Scardapane
Fabrizio Silvestri
Emanuele Rodolà
MoMe
147
16
0
26 Nov 2024
ATM: Improving Model Merging by Alternating Tuning and Merging
Luca Zhou
Daniele Solombrino
Donato Crisostomi
Maria Sofia Bucarelli
Fabrizio Silvestri
Emanuele Rodolà
MoMe
127
5
0
05 Nov 2024
CRoP: Context-wise Robust Static Human-Sensing Personalization
Sawinder Kaur
Avery Gump
Yi Xiao
Jingyu Xin
Harshit Sharma
Nina R Benway
Jonathan L Preston
Asif Salekin
111
0
0
26 Sep 2024
Elastic Weight Removal for Faithful and Abstractive Dialogue Generation
Nico Daheim
Nouha Dziri
Mrinmaya Sachan
Iryna Gurevych
Edoardo Ponti
MoMe
103
30
0
30 Mar 2023
Editing Models with Task Arithmetic
Gabriel Ilharco
Marco Tulio Ribeiro
Mitchell Wortsman
Suchin Gururangan
Ludwig Schmidt
Hannaneh Hajishirzi
Ali Farhadi
KELM, MoMe, MU
203
521
0
08 Dec 2022
Scaling Instruction-Finetuned Language Models
Hyung Won Chung
Le Hou
Shayne Longpre
Barret Zoph
Yi Tay
...
Jacob Devlin
Adam Roberts
Denny Zhou
Quoc V. Le
Jason W. Wei
ReLM, LRM
234
3,158
0
20 Oct 2022
Git Re-Basin: Merging Models modulo Permutation Symmetries
Samuel K. Ainsworth
J. Hayase
S. Srinivasa
MoMe
318
344
0
11 Sep 2022
Composable Sparse Fine-Tuning for Cross-Lingual Transfer
Alan Ansell
Edoardo Ponti
Anna Korhonen
Ivan Vulić
CLL, MoE
141
143
0
14 Oct 2021
Robust fine-tuning of zero-shot models
Mitchell Wortsman
Gabriel Ilharco
Jong Wook Kim
Mike Li
Simon Kornblith
...
Raphael Gontijo-Lopes
Hannaneh Hajishirzi
Ali Farhadi
Hongseok Namkoong
Ludwig Schmidt
VLM
169
739
0
04 Sep 2021
$Q^{2}$: Evaluating Factual Consistency in Knowledge-Grounded Dialogues via Question Generation and Question Answering
Or Honovich
Leshem Choshen
Roee Aharoni
Ella Neeman
Idan Szpektor
Omri Abend
HILM
88
141
0
16 Apr 2021
RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models
Samuel Gehman
Suchin Gururangan
Maarten Sap
Yejin Choi
Noah A. Smith
170
1,221
0
24 Sep 2020
What is being transferred in transfer learning?
Behnam Neyshabur
Hanie Sedghi
Chiyuan Zhang
122
528
0
26 Aug 2020
Linear Mode Connectivity and the Lottery Ticket Hypothesis
Jonathan Frankle
Gintare Karolina Dziugaite
Daniel M. Roy
Michael Carbin
MoMe
163
630
0
11 Dec 2019
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
506
20,376
0
23 Oct 2019
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
AIMat
700
24,572
0
26 Jul 2019
Practical Deep Learning with Bayesian Principles
Kazuki Osawa
S. Swaroop
Anirudh Jain
Runa Eschenhagen
Richard Turner
Rio Yokota
Mohammad Emtiyaz Khan
BDL, UQCV
153
248
0
06 Jun 2019
Fast and Scalable Bayesian Deep Learning by Weight-Perturbation in Adam
Mohammad Emtiyaz Khan
Didrik Nielsen
Voot Tangkaratt
Wu Lin
Y. Gal
Akash Srivastava
ODL
177
272
0
13 Jun 2018
Progress & Compress: A scalable framework for continual learning
Jonathan Richard Schwarz
Jelena Luketina
Wojciech M. Czarnecki
A. Grabska-Barwinska
Yee Whye Teh
Razvan Pascanu
R. Hadsell
CLL
129
889
0
16 May 2018
Averaging Weights Leads to Wider Optima and Better Generalization
Pavel Izmailov
Dmitrii Podoprikhin
T. Garipov
Dmitry Vetrov
A. Wilson
FedML, MoMe
143
1,673
0
14 Mar 2018
On Quadratic Penalties in Elastic Weight Consolidation
Ferenc Huszár
69
100
0
11 Dec 2017
Understanding Black-box Predictions via Influence Functions
Pang Wei Koh
Percy Liang
TDI
225
2,910
0
14 Mar 2017
Remote Sensing Image Scene Classification: Benchmark and State of the Art
Gong Cheng
Junwei Han
Xiaoqiang Lu
108
2,269
0
01 Mar 2017
Overcoming catastrophic forgetting in neural networks
J. Kirkpatrick
Razvan Pascanu
Neil C. Rabinowitz
J. Veness
Guillaume Desjardins
...
A. Grabska-Barwinska
Demis Hassabis
Claudia Clopath
D. Kumaran
R. Hadsell
CLL
374
7,587
0
02 Dec 2016
Pointer Sentinel Mixture Models
Stephen Merity
Caiming Xiong
James Bradbury
R. Socher
RALM
349
2,900
0
26 Sep 2016
Distributed Gaussian Processes
M. Deisenroth
Jun Wei Ng
GP
88
342
0
10 Feb 2015
Describing Textures in the Wild
Mircea Cimpoi
Subhransu Maji
Iasonas Kokkinos
S. Mohamed
Andrea Vedaldi
3DV
151
2,695
0
14 Nov 2013
Revisiting Natural Gradient for Deep Networks
Razvan Pascanu
Yoshua Bengio
ODL
255
389
0
16 Jan 2013