2302.04863
Cited By
Knowledge is a Region in Weight Space for Fine-tuned Language Models
9 February 2023
Almog Gueta
Elad Venezian
Colin Raffel
Noam Slonim
Yoav Katz
Leshem Choshen
Papers citing
"Knowledge is a Region in Weight Space for Fine-tuned Language Models"
38 / 38 papers shown
Title
TrustLoRA: Low-Rank Adaptation for Failure Detection under Out-of-distribution Data
Fei Zhu
Zhaoxiang Zhang
OODD
UQCV
65
0
0
20 Apr 2025
Charting and Navigating Hugging Face's Model Atlas
Eliahu Horwitz
Nitzan Kurer
Jonathan Kahana
Liel Amar
Yedid Hoshen
41
0
0
13 Mar 2025
Neuroplasticity and Corruption in Model Mechanisms: A Case Study Of Indirect Object Identification
Vishnu Kabir Chhabra
Ding Zhu
Mohammad Mahdi Khalili
37
2
0
27 Feb 2025
Portable Reward Tuning: Towards Reusable Fine-Tuning across Different Pretrained Models
Daiki Chijiwa
Taku Hasegawa
Kyosuke Nishida
Kuniko Saito
Susumu Takeuchi
47
0
0
18 Feb 2025
Localize-and-Stitch: Efficient Model Merging via Sparse Task Arithmetic
Yifei He
Yuzheng Hu
Yong Lin
Tong Zhang
Han Zhao
FedML
MoMe
65
18
0
08 Jan 2025
Gradient Localization Improves Lifelong Pretraining of Language Models
Jared Fernandez
Yonatan Bisk
Emma Strubell
KELM
36
1
0
07 Nov 2024
Local Contrastive Editing of Gender Stereotypes
Marlene Lutz
Rochelle Choenni
M. Strohmaier
Anne Lauscher
32
1
0
23 Oct 2024
What Matters for Model Merging at Scale?
Prateek Yadav
Tu Vu
Jonathan Lai
Alexandra Chronopoulou
Manaal Faruqui
Joey Tianyi Zhou
Tsendsuren Munkhdalai
MoMe
46
15
0
04 Oct 2024
Realistic Evaluation of Model Merging for Compositional Generalization
Derek Tam
Yash Kant
Brian Lester
Igor Gilitschenski
Colin Raffel
MoMe
35
6
0
26 Sep 2024
MoFO: Momentum-Filtered Optimizer for Mitigating Forgetting in LLM Fine-Tuning
Yupeng Chen
Senmiao Wang
Zhihang Lin
Yushun Zhang
Tian Ding
Ruoyu Sun
CLL
74
1
0
30 Jul 2024
WARP: On the Benefits of Weight Averaged Rewarded Policies
Alexandre Ramé
Johan Ferret
Nino Vieillard
Robert Dadashi
Léonard Hussenot
Pierre-Louis Cedoz
Pier Giuseppe Sessa
Sertan Girgin
Arthur Douillard
Olivier Bachem
56
14
0
24 Jun 2024
Fighting Randomness with Randomness: Mitigating Optimisation Instability of Fine-Tuning using Delayed Ensemble and Noisy Interpolation
Branislav Pecher
Ján Cegin
Róbert Belanec
Jakub Simko
Ivan Srba
M. Bieliková
41
1
0
18 Jun 2024
Twin-Merging: Dynamic Integration of Modular Expertise in Model Merging
Zhenyi Lu
Chenghao Fan
Wei Wei
Xiaoye Qu
Dangyang Chen
Yu Cheng
MoMe
42
48
0
17 Jun 2024
Enhancing Noise Robustness of Retrieval-Augmented Language Models with Adaptive Adversarial Training
Feiteng Fang
Yuelin Bai
Shiwen Ni
Min Yang
Xiaojun Chen
Ruifeng Xu
AAML
RALM
39
30
0
31 May 2024
Evaluating the External and Parametric Knowledge Fusion of Large Language Models
Hao Zhang
Yuyang Zhang
Xiaoguang Li
Wenxuan Shi
Haonan Xu
...
Yasheng Wang
Lifeng Shang
Qun Liu
Yong-jin Liu
Ruiming Tang
KELM
35
4
0
29 May 2024
Disperse-Then-Merge: Pushing the Limits of Instruction Tuning via Alignment Tax Reduction
Tingchen Fu
Deng Cai
Lemao Liu
Shuming Shi
Rui Yan
MoMe
50
13
0
22 May 2024
Lossless and Near-Lossless Compression for Foundation Models
Moshik Hershcovitch
Leshem Choshen
Andrew Wood
Ilias Enmouri
Peter Chin
S. Sundararaman
Danny Harnik
49
6
0
05 Apr 2024
Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity Tracking
Nikhil Prakash
Tamar Rott Shaham
Tal Haklay
Yonatan Belinkov
David Bau
46
52
0
22 Feb 2024
Towards Unified Task Embeddings Across Multiple Models: Bridging the Gap for Prompt-Based Large Language Models and Beyond
Xinyu Wang
Hainiu Xu
Lin Gui
Yulan He
MoMe
AIFin
36
1
0
22 Feb 2024
Backward Lens: Projecting Language Model Gradients into the Vocabulary Space
Shahar Katz
Yonatan Belinkov
Mor Geva
Lior Wolf
63
10
1
20 Feb 2024
Bayesian Multi-Task Transfer Learning for Soft Prompt Tuning
Haeju Lee
Minchan Jeong
SeYoung Yun
Kee-Eung Kim
AAML
VPVLM
53
2
0
13 Feb 2024
WARM: On the Benefits of Weight Averaged Reward Models
Alexandre Ramé
Nino Vieillard
Léonard Hussenot
Robert Dadashi
Geoffrey Cideron
Olivier Bachem
Johan Ferret
111
93
0
22 Jan 2024
A Comprehensive Study of Knowledge Editing for Large Language Models
Ningyu Zhang
Yunzhi Yao
Bo Tian
Peng Wang
Shumin Deng
...
Lei Liang
Qing Cui
Xiao-Jun Zhu
Jun Zhou
Huajun Chen
KELM
41
76
0
02 Jan 2024
Merging by Matching Models in Task Parameter Subspaces
Derek Tam
Mohit Bansal
Colin Raffel
MoMe
21
10
0
07 Dec 2023
Can Knowledge Graphs Reduce Hallucinations in LLMs? : A Survey
Garima Agrawal
Tharindu Kumarage
Zeyad Alghamdi
Huan Liu
37
81
0
14 Nov 2023
Fuse to Forget: Bias Reduction and Selective Memorization through Model Fusion
Kerem Zaman
Leshem Choshen
Shashank Srivastava
KELM
MoMe
20
10
0
13 Nov 2023
RSVP: Customer Intent Detection via Agent Response Contrastive and Generative Pre-Training
Yu-Chien Tang
Wei-Yao Wang
An-Zi Yen
Wen-Chih Peng
26
1
0
15 Oct 2023
Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy
Pingzhi Li
Zhenyu (Allen) Zhang
Prateek Yadav
Yi-Lin Sung
Yu Cheng
Mohit Bansal
Tianlong Chen
MoMe
26
33
0
02 Oct 2023
Cordyceps@LT-EDI: Patching Language-Specific Homophobia/Transphobia Classifiers with a Multilingual Understanding
Dean Ninalga
24
2
0
24 Sep 2023
Derivative Free Weight-space Ensembling
Dean Ninalga
MoMe
26
0
0
07 Jul 2023
Sparse Model Soups: A Recipe for Improved Pruning via Model Averaging
Max Zimmer
Christoph Spiegel
Sebastian Pokutta
MoMe
41
14
0
29 Jun 2023
TIES-Merging: Resolving Interference When Merging Models
Prateek Yadav
Derek Tam
Leshem Choshen
Colin Raffel
Joey Tianyi Zhou
MoMe
40
250
0
02 Jun 2023
ZipIt! Merging Models from Different Tasks without Training
George Stoica
Daniel Bolya
J. Bjorner
Pratik Ramesh
Taylor N. Hearn
Judy Hoffman
VLM
MoMe
44
110
0
04 May 2023
ColD Fusion: Collaborative Descent for Distributed Multitask Finetuning
Shachar Don-Yehiya
Elad Venezian
Colin Raffel
Noam Slonim
Yoav Katz
Leshem Choshen
MoMe
28
52
0
02 Dec 2022
Git Re-Basin: Merging Models modulo Permutation Symmetries
Samuel K. Ainsworth
J. Hayase
S. Srinivasa
MoMe
252
314
0
11 Sep 2022
Linear Connectivity Reveals Generalization Strategies
Jeevesh Juneja
Rachit Bansal
Kyunghyun Cho
João Sedoc
Naomi Saphra
237
45
0
24 May 2022
e-SNLI: Natural Language Inference with Natural Language Explanations
Oana-Maria Camburu
Tim Rocktäschel
Thomas Lukasiewicz
Phil Blunsom
LRM
255
620
0
04 Dec 2018
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
297
6,956
0
20 Apr 2018