ResearchTrend.AI
SurgeryV2: Bridging the Gap Between Model Merging and Multi-Task Learning with Deep Representation Surgery

18 October 2024
Enneng Yang
Li Shen
Zhenyi Wang
G. Guo
Xingwei Wang
Xiaocun Cao
Jie Zhang
Dacheng Tao
    MoMe

Papers citing "SurgeryV2: Bridging the Gap Between Model Merging and Multi-Task Learning with Deep Representation Surgery"

30 / 30 papers shown
Title
Scalable Model Merging with Progressive Layer-wise Distillation
Jing Xu
Jiazheng Li
J.N. Zhang
MoMe FedML
314
2
0
18 Feb 2025
Evolutionary Optimization of Model Merging Recipes
Takuya Akiba
Makoto Shing
Yujin Tang
Qi Sun
David Ha
MoMe
289
125
0
28 Jan 2025
EMR-Merging: Tuning-Free High-Performance Model Merging
Chenyu Huang
Peng Ye
Tao Chen
Tong He
Xiangyu Yue
Wanli Ouyang
MoMe
78
45
0
23 May 2024
Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch
Le Yu
Yu Bowen
Haiyang Yu
Fei Huang
Yongbin Li
MoMe
107
333
0
06 Nov 2023
Parameter Efficient Multi-task Model Fusion with Partial Linearization
Anke Tang
Li Shen
Yong Luo
Yibing Zhan
Han Hu
Bo Du
Yixin Chen
Dacheng Tao
MoMe
89
36
0
07 Oct 2023
Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy
Pingzhi Li
Zhenyu Zhang
Prateek Yadav
Yi-Lin Sung
Yu Cheng
Mohit Bansal
Tianlong Chen
MoMe
69
39
0
02 Oct 2023
Rewarded soups: towards Pareto-optimal alignment by interpolating weights fine-tuned on diverse rewards
Alexandre Ramé
Guillaume Couairon
Mustafa Shukor
Corentin Dancette
Jean-Baptiste Gaya
Laure Soulier
Matthieu Cord
MoMe
85
156
0
07 Jun 2023
Editing Models with Task Arithmetic
Gabriel Ilharco
Marco Tulio Ribeiro
Mitchell Wortsman
Suchin Gururangan
Ludwig Schmidt
Hannaneh Hajishirzi
Ali Farhadi
KELM MoMe MU
197
518
0
08 Dec 2022
REPAIR: REnormalizing Permuted Activations for Interpolation Repair
Keller Jordan
Hanie Sedghi
O. Saukh
R. Entezari
Behnam Neyshabur
MoMe
82
101
0
15 Nov 2022
Git Re-Basin: Merging Models modulo Permutation Symmetries
Samuel K. Ainsworth
J. Hayase
S. Srinivasa
MoMe
298
343
0
11 Sep 2022
MetaBalance: Improving Multi-Task Recommendations via Adapting Gradient Magnitudes of Auxiliary Tasks
Yun He
Xuening Feng
Cheng Cheng
Geng Ji
Yunsong Guo
James Caverlee
53
43
0
14 Mar 2022
Multi-Task Learning in Natural Language Processing: An Overview
Shijie Chen
Yu Zhang
Qiang Yang
AIMat
120
108
0
19 Sep 2021
LoRA: Low-Rank Adaptation of Large Language Models
J. E. Hu
Yelong Shen
Phillip Wallis
Zeyuan Allen-Zhu
Yuanzhi Li
Shean Wang
Lu Wang
Weizhu Chen
OffRL AI4TS AI4CE ALM AIMat
490
10,496
0
17 Jun 2021
Continual Learning for Text Classification with Information Disentanglement Based Regularization
Yufan Huang
Yanzhe Zhang
Jiaao Chen
Xuezhi Wang
Diyi Yang
CLL
64
112
0
12 Apr 2021
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP VLM
972
29,810
0
26 Feb 2021
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy
Lucas Beyer
Alexander Kolesnikov
Dirk Weissenborn
Xiaohua Zhai
...
Matthias Minderer
G. Heigold
Sylvain Gelly
Jakob Uszkoreit
N. Houlsby
ViT
673
41,430
0
22 Oct 2020
Gradient Vaccine: Investigating and Improving Multi-task Optimization in Massively Multilingual Models
Zirui Wang
Yulia Tsvetkov
Orhan Firat
Yuan Cao
72
202
0
12 Oct 2020
A Survey on Negative Transfer
Wen Zhang
Lingfei Deng
Lei Zhang
Dongrui Wu
AAML
100
221
0
02 Sep 2020
Knowledge Distillation for Multi-task Learning
Weihong Li
Hakan Bilen
MoMe
54
73
0
14 Jul 2020
MTL-NAS: Task-Agnostic Neural Architecture Search towards General-Purpose Multi-Task Learning
Yuan Gao
Haoping Bai
Zequn Jie
Jiayi Ma
Kui Jia
Wei Liu
77
95
0
31 Mar 2020
Multi-task self-supervised learning for Robust Speech Recognition
Mirco Ravanelli
Jianyuan Zhong
Santiago Pascual
P. Swietojanski
João Monteiro
J. Trmal
Yoshua Bengio
SSL
281
290
0
25 Jan 2020
Gradient Surgery for Multi-Task Learning
Tianhe Yu
Saurabh Kumar
Abhishek Gupta
Sergey Levine
Karol Hausman
Chelsea Finn
180
1,228
0
19 Jan 2020
Model Fusion via Optimal Transport
Sidak Pal Singh
Martin Jaggi
MoMe FedML
116
240
0
12 Oct 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM SSL SSeg
1.8K
95,175
0
11 Oct 2018
EuroSAT: A Novel Dataset and Deep Learning Benchmark for Land Use and Land Cover Classification
P. Helber
B. Bischke
Andreas Dengel
Damian Borth
154
1,830
0
31 Aug 2017
Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics
Alex Kendall
Y. Gal
R. Cipolla
3DH
272
3,135
0
19 May 2017
Remote Sensing Image Scene Classification: Benchmark and State of the Art
Gong Cheng
Junwei Han
Xiaoqiang Lu
106
2,264
0
01 Mar 2017
Communication-Efficient Learning of Deep Networks from Decentralized Data
H. B. McMahan
Eider Moore
Daniel Ramage
S. Hampson
Blaise Agüera y Arcas
FedML
408
17,593
0
17 Feb 2016
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
2.1K
150,312
0
22 Dec 2014
Describing Textures in the Wild
Mircea Cimpoi
Subhransu Maji
Iasonas Kokkinos
S. Mohamed
Andrea Vedaldi
3DV
146
2,689
0
14 Nov 2013