ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2505.16703
  4. Cited By
Locate-then-Merge: Neuron-Level Parameter Fusion for Mitigating Catastrophic Forgetting in Multimodal LLMs

Locate-then-Merge: Neuron-Level Parameter Fusion for Mitigating Catastrophic Forgetting in Multimodal LLMs

22 May 2025
Zeping Yu
Sophia Ananiadou
    MoMeKELMCLL
ArXiv (abs)PDFHTML

Papers citing "Locate-then-Merge: Neuron-Level Parameter Fusion for Mitigating Catastrophic Forgetting in Multimodal LLMs"

34 / 34 papers shown
Title
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced
  Multimodal Understanding
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
Z. F. Wu
Xiaokang Chen
Zizheng Pan
Xianglong Liu
Wen Liu
...
Xingkai Yu
Haowei Zhang
Liang Zhao
Yijiao Wang
Chong Ruan
MLLMVLMMoE
184
158
0
13 Dec 2024
Training-Free Mitigation of Language Reasoning Degradation After
  Multimodal Instruction Tuning
Training-Free Mitigation of Language Reasoning Degradation After Multimodal Instruction Tuning
Neale Ratzlaff
Man Luo
Xin Su
Vasudev Lal
Phillip Howard
VLMMLLMLRM
112
2
0
04 Dec 2024
Arithmetic Without Algorithms: Language Models Solve Math With a Bag of Heuristics
Arithmetic Without Algorithms: Language Models Solve Math With a Bag of Heuristics
Yaniv Nikankin
Anja Reusch
Aaron Mueller
Yonatan Belinkov
AIFinLRM
115
33
0
28 Oct 2024
Interpreting Arithmetic Mechanism in Large Language Models through
  Comparative Neuron Analysis
Interpreting Arithmetic Mechanism in Large Language Models through Comparative Neuron Analysis
Zeping Yu
Sophia Ananiadou
LRMMILM
100
14
0
21 Sep 2024
Mitigating Catastrophic Forgetting in Language Transfer via Model
  Merging
Mitigating Catastrophic Forgetting in Language Transfer via Model Merging
Anton Alexandrov
Veselin Raychev
Mark Niklas Muller
Ce Zhang
Martin Vechev
Kristina Toutanova
MoMeCLLKELM
86
20
0
11 Jul 2024
DELLA-Merging: Reducing Interference in Model Merging through
  Magnitude-Based Sampling
DELLA-Merging: Reducing Interference in Model Merging through Magnitude-Based Sampling
Pala Tej Deep
Rishabh Bhardwaj
Soujanya Poria
MoMe
75
31
0
17 Jun 2024
Wings: Learning Multimodal LLMs without Text-only Forgetting
Wings: Learning Multimodal LLMs without Text-only Forgetting
Yi-Kai Zhang
Shiyin Lu
Yang Li
Yanqing Ma
Qing-Guo Chen
Zhao Xu
Weihua Luo
Kaifu Zhang
De-Chuan Zhan
Han-Jia Ye
VLM
94
10
0
05 Jun 2024
Arcee's MergeKit: A Toolkit for Merging Large Language Models
Arcee's MergeKit: A Toolkit for Merging Large Language Models
Charles Goddard
Shamane Siriwardhana
Malikeh Ehghaghi
Luke Meyers
Vladimir Karpukhin
Brian Benedict
Mark McQuade
Jacob Solawetz
MoMeKELM
157
101
0
20 Mar 2024
A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO
  and Toxicity
A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity
Andrew Lee
Xiaoyan Bai
Itamar Pres
Martin Wattenberg
Jonathan K. Kummerfeld
Rada Mihalcea
126
121
0
03 Jan 2024
Model Breadcrumbs: Scaling Multi-Task Model Merging with Sparse Masks
Model Breadcrumbs: Scaling Multi-Task Model Merging with Sparse Masks
Mohammad-Javad Davari
Eugene Belilovsky
MoMe
88
70
0
11 Dec 2023
MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning
  Benchmark for Expert AGI
MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI
Xiang Yue
Yuansheng Ni
Kai Zhang
Tianyu Zheng
Ruoqi Liu
...
Yibo Liu
Wenhao Huang
Huan Sun
Yu-Chuan Su
Wenhu Chen
OSLMELMVLM
266
960
0
27 Nov 2023
Identifying and Mitigating Vulnerabilities in LLM-Integrated
  Applications
Identifying and Mitigating Vulnerabilities in LLM-Integrated Applications
Fengqing Jiang
Zhangchen Xu
Luyao Niu
Wei Ping
Jinyuan Jia
Bo Li
Radha Poovendran
AAML
91
36
0
07 Nov 2023
Language Models are Super Mario: Absorbing Abilities from Homologous
  Models as a Free Lunch
Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch
Le Yu
Yu Bowen
Haiyang Yu
Fei Huang
Yongbin Li
MoMe
111
335
0
06 Nov 2023
Improved Baselines with Visual Instruction Tuning
Improved Baselines with Visual Instruction Tuning
Haotian Liu
Chunyuan Li
Yuheng Li
Yong Jae Lee
VLMMLLM
171
2,825
0
05 Oct 2023
An Empirical Study of Catastrophic Forgetting in Large Language Models During Continual Fine-tuning
An Empirical Study of Catastrophic Forgetting in Large Language Models During Continual Fine-tuning
Yun Luo
Zhen Yang
Fandong Meng
Yafu Li
Jie Zhou
Yue Zhang
CLLKELM
190
318
0
17 Aug 2023
Multimodal Neurons in Pretrained Text-Only Transformers
Multimodal Neurons in Pretrained Text-Only Transformers
Sarah Schwettmann
Neil Chowdhury
Samuel J. Klein
David Bau
Antonio Torralba
MILM
81
32
0
03 Aug 2023
Learn to Explain: Multimodal Reasoning via Thought Chains for Science
  Question Answering
Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering
Pan Lu
Swaroop Mishra
Tony Xia
Liang Qiu
Kai-Wei Chang
Song-Chun Zhu
Oyvind Tafjord
Peter Clark
Ashwin Kalyan
ELMReLMLRM
290
1,299
0
20 Sep 2022
Transformer Feed-Forward Layers Build Predictions by Promoting Concepts
  in the Vocabulary Space
Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space
Mor Geva
Avi Caciularu
Ke Wang
Yoav Goldberg
KELM
130
387
0
28 Mar 2022
Training language models to follow instructions with human feedback
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLMALM
888
13,207
0
04 Mar 2022
Knowledge Neurons in Pretrained Transformers
Knowledge Neurons in Pretrained Transformers
Damai Dai
Li Dong
Y. Hao
Zhifang Sui
Baobao Chang
Furu Wei
KELMMU
110
465
0
18 Apr 2021
Learning Transferable Visual Models From Natural Language Supervision
Learning Transferable Visual Models From Natural Language Supervision
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
...
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIPVLM
999
29,926
0
26 Feb 2021
Transformer Feed-Forward Layers Are Key-Value Memories
Transformer Feed-Forward Layers Are Key-Value Memories
Mor Geva
R. Schuster
Jonathan Berant
Omer Levy
KELM
182
848
0
29 Dec 2020
Language Models are Few-Shot Learners
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
889
42,463
0
28 May 2020
GLU Variants Improve Transformer
GLU Variants Improve Transformer
Noam M. Shazeer
154
1,022
0
12 Feb 2020
PIQA: Reasoning about Physical Commonsense in Natural Language
PIQA: Reasoning about Physical Commonsense in Natural Language
Yonatan Bisk
Rowan Zellers
Ronan Le Bras
Jianfeng Gao
Yejin Choi
OODLRM
186
1,847
0
26 Nov 2019
CommonsenseQA: A Question Answering Challenge Targeting Commonsense
  Knowledge
CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge
Alon Talmor
Jonathan Herzig
Nicholas Lourie
Jonathan Berant
RALM
146
1,752
0
02 Nov 2018
Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book
  Question Answering
Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering
Todor Mihaylov
Peter Clark
Tushar Khot
Ashish Sabharwal
122
1,570
0
08 Sep 2018
Think you have Solved Question Answering? Try ARC, the AI2 Reasoning
  Challenge
Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge
Peter Clark
Isaac Cowhey
Oren Etzioni
Tushar Khot
Ashish Sabharwal
Carissa Schoenick
Oyvind Tafjord
ELMRALMLRM
198
2,670
0
14 Mar 2018
The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks
The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks
Jonathan Frankle
Michael Carbin
272
3,488
0
09 Mar 2018
Measuring Catastrophic Forgetting in Neural Networks
Measuring Catastrophic Forgetting in Neural Networks
Ronald Kemker
Marc McClure
Angelina Abitino
Tyler L. Hayes
Christopher Kanan
CLL
125
722
0
07 Aug 2017
RACE: Large-scale ReAding Comprehension Dataset From Examinations
RACE: Large-scale ReAding Comprehension Dataset From Examinations
Guokun Lai
Qizhe Xie
Hanxiao Liu
Yiming Yang
Eduard H. Hovy
ELM
198
1,359
0
15 Apr 2017
Overcoming catastrophic forgetting in neural networks
Overcoming catastrophic forgetting in neural networks
J. Kirkpatrick
Razvan Pascanu
Neil C. Rabinowitz
J. Veness
Guillaume Desjardins
...
A. Grabska-Barwinska
Demis Hassabis
Claudia Clopath
D. Kumaran
R. Hadsell
CLL
374
7,572
0
02 Dec 2016
VQA: Visual Question Answering
VQA: Visual Question Answering
Aishwarya Agrawal
Jiasen Lu
Stanislaw Antol
Margaret Mitchell
C. L. Zitnick
Dhruv Batra
Devi Parikh
CoGe
233
5,509
0
03 May 2015
An Empirical Investigation of Catastrophic Forgetting in Gradient-Based
  Neural Networks
An Empirical Investigation of Catastrophic Forgetting in Gradient-Based Neural Networks
Ian Goodfellow
M. Berk Mirza
Xia Da
Aaron Courville
Yoshua Bengio
159
1,455
0
21 Dec 2013
1