2209.04836
Git Re-Basin: Merging Models modulo Permutation Symmetries
11 September 2022
Samuel K. Ainsworth
J. Hayase
S. Srinivasa
MoMe
Papers citing "Git Re-Basin: Merging Models modulo Permutation Symmetries"
30 / 80 papers shown
Title
Transformer Fusion with Optimal Transport
Moritz Imfeld
Jacopo Graldi
Marco Giordano
Thomas Hofmann
Sotiris Anagnostidis
Sidak Pal Singh
ViT
MoMe
22
16
0
09 Oct 2023
Jointly Training Large Autoregressive Multimodal Models
Emanuele Aiello
L. Yu
Yixin Nie
Armen Aghajanyan
Barlas Oğuz
13
29
0
27 Sep 2023
Geodesic Mode Connectivity
Charlie Tan
Theodore Long
Sarah Zhao
Rudolf Laine
11
2
0
24 Aug 2023
Shrink-Perturb Improves Architecture Mixing during Population Based Training for Neural Architecture Search
A. Chebykin
A. Dushatskiy
T. Alderliesten
Peter A. N. Bosman
29
0
0
28 Jul 2023
Layer-wise Linear Mode Connectivity
Linara Adilova
Maksym Andriushchenko
Michael Kamp
Asja Fischer
Martin Jaggi
FedML
FAtt
MoMe
28
15
0
13 Jul 2023
On The Impact of Machine Learning Randomness on Group Fairness
Prakhar Ganesh
Hong Chang
Martin Strobel
Reza Shokri
FaML
21
30
0
09 Jul 2023
Investigating how ReLU-networks encode symmetries
Georg Bökman
Fredrik Kahl
24
6
0
26 May 2023
Task Arithmetic in the Tangent Space: Improved Editing of Pre-Trained Models
Guillermo Ortiz-Jiménez
Alessandro Favero
P. Frossard
MoMe
37
103
0
22 May 2023
MGR: Multi-generator Based Rationalization
Wei Liu
Haozhao Wang
Jun Wang
Rui Li
Xinyang Li
Yuankai Zhang
Yang Qiu
19
7
0
08 May 2023
Sparsified Model Zoo Twins: Investigating Populations of Sparsified Neural Network Models
D. Honegger
Konstantin Schürholt
Damian Borth
20
4
0
26 Apr 2023
Elastic Weight Removal for Faithful and Abstractive Dialogue Generation
Nico Daheim
Nouha Dziri
Mrinmaya Sachan
Iryna Gurevych
E. Ponti
MoMe
26
30
0
30 Mar 2023
Deep Learning on Implicit Neural Representations of Shapes
Luca de Luigi
Adriano Cardace
Riccardo Spezialetti
Pierluigi Zama Ramirez
Samuele Salti
Luigi Di Stefano
21
46
0
10 Feb 2023
Knowledge is a Region in Weight Space for Fine-tuned Language Models
Almog Gueta
Elad Venezian
Colin Raffel
Noam Slonim
Yoav Katz
Leshem Choshen
26
49
0
09 Feb 2023
Equivariant Architectures for Learning in Deep Weight Spaces
Aviv Navon
Aviv Shamsian
Idan Achituve
Ethan Fetaya
Gal Chechik
Haggai Maron
30
63
0
30 Jan 2023
Training trajectories, mini-batch losses and the curious role of the learning rate
Mark Sandler
A. Zhmoginov
Max Vladymyrov
Nolan Miller
ODL
13
10
0
05 Jan 2023
Dataless Knowledge Fusion by Merging Weights of Language Models
Xisen Jin
Xiang Ren
Daniel Preotiuc-Pietro
Pengxiang Cheng
FedML
MoMe
13
211
0
19 Dec 2022
Editing Models with Task Arithmetic
Gabriel Ilharco
Marco Tulio Ribeiro
Mitchell Wortsman
Suchin Gururangan
Ludwig Schmidt
Hannaneh Hajishirzi
Ali Farhadi
KELM
MoMe
MU
43
424
0
08 Dec 2022
ColD Fusion: Collaborative Descent for Distributed Multitask Finetuning
Shachar Don-Yehiya
Elad Venezian
Colin Raffel
Noam Slonim
Yoav Katz
Leshem Choshen
MoMe
26
52
0
02 Dec 2022
Linear Interpolation In Parameter Space is Good Enough for Fine-Tuned Language Models
Mark Rofin
Nikita Balagansky
Daniil Gavrilov
MoMe
KELM
31
5
0
22 Nov 2022
REPAIR: REnormalizing Permuted Activations for Interpolation Repair
Keller Jordan
Hanie Sedghi
O. Saukh
R. Entezari
Behnam Neyshabur
MoMe
46
94
0
15 Nov 2022
Symmetries, flat minima, and the conserved quantities of gradient flow
Bo-Lu Zhao
I. Ganev
Robin G. Walters
Rose Yu
Nima Dehmamy
42
16
0
31 Oct 2022
lo-fi: distributed fine-tuning without communication
Mitchell Wortsman
Suchin Gururangan
Shen Li
Ali Farhadi
Ludwig Schmidt
Michael G. Rabbat
Ari S. Morcos
19
24
0
19 Oct 2022
Wasserstein Barycenter-based Model Fusion and Linear Mode Connectivity of Neural Networks
A. K. Akash
Sixu Li
Nicolas García Trillos
24
12
0
13 Oct 2022
Stochastic optimization on matrices and a graphon McKean-Vlasov limit
Zaïd Harchaoui
Sewoong Oh
Soumik Pal
Raghav Somani
Raghavendra Tripathi
20
2
0
02 Oct 2022
Random initialisations performing above chance and how to find them
Frederik Benzing
Simon Schug
Robert Meier
J. Oswald
Yassir Akram
Nicolas Zucchet
Laurence Aitchison
Angelika Steger
ODL
15
24
0
15 Sep 2022
Trajectory-dependent Generalization Bounds for Deep Neural Networks via Fractional Brownian Motion
Chengli Tan
Jiang Zhang
Junmin Liu
35
1
0
09 Jun 2022
Linear Connectivity Reveals Generalization Strategies
Jeevesh Juneja
Rachit Bansal
Kyunghyun Cho
João Sedoc
Naomi Saphra
232
45
0
24 May 2022
Deep Networks on Toroids: Removing Symmetries Reveals the Structure of Flat Regions in the Landscape Geometry
Fabrizio Pittorino
Antonio Ferraro
Gabriele Perugini
Christoph Feinauer
Carlo Baldassi
R. Zecchina
199
24
0
07 Feb 2022
Optimizing Mode Connectivity via Neuron Alignment
N. Joseph Tatro
Pin-Yu Chen
Payel Das
Igor Melnyk
P. Sattigeri
Rongjie Lai
MoMe
223
80
0
05 Sep 2020
Top-N Recommender System via Matrix Completion
Zhao Kang
Chong Peng
Q. Cheng
205
111
0
19 Jan 2016