Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2306.01708
Cited By
TIES-Merging: Resolving Interference When Merging Models
2 June 2023
Prateek Yadav
Derek Tam
Leshem Choshen
Colin Raffel
Joey Tianyi Zhou
MoMe
Re-assign community
ArXiv
PDF
HTML
Papers citing
"TIES-Merging: Resolving Interference When Merging Models"
21 / 221 papers shown
Title
Lossless and Near-Lossless Compression for Foundation Models
Moshik Hershcovitch
Leshem Choshen
Andrew Wood
Ilias Enmouri
Peter Chin
S. Sundararaman
Danny Harnik
49
6
0
05 Apr 2024
Linear Combination of Saved Checkpoints Makes Consistency and Diffusion Models Better
En-hao Liu
Junyi Zhu
Zinan Lin
Xuefei Ning
Shuaiqi Wang
...
Sergey Yekhanin
Guohao Dai
Huazhong Yang
Yu-Xiang Wang
Yu Wang
MoMe
57
4
0
02 Apr 2024
Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order
Taishi Nakamura
Mayank Mishra
Simone Tedeschi
Yekun Chai
Jason T Stillerman
...
Virendra Mehta
Matthew Blumberg
Victor May
Huu Nguyen
S. Pyysalo
LRM
45
7
0
30 Mar 2024
Simple and Scalable Strategies to Continually Pre-train Large Language Models
Adam Ibrahim
Benjamin Thérien
Kshitij Gupta
Mats L. Richter
Quentin Anthony
Timothée Lesort
Eugene Belilovsky
Irina Rish
KELM
CLL
44
54
0
13 Mar 2024
SELMA: Learning and Merging Skill-Specific Text-to-Image Experts with Auto-Generated Data
Jialu Li
Jaemin Cho
Yi-Lin Sung
Jaehong Yoon
Mohit Bansal
MoMe
DiffM
47
8
0
11 Mar 2024
Training Neural Networks from Scratch with Parallel Low-Rank Adapters
Minyoung Huh
Brian Cheung
Jeremy Bernstein
Phillip Isola
Pulkit Agrawal
37
10
0
26 Feb 2024
Knowledge Fusion of Chat LLMs: A Preliminary Technical Report
Fanqi Wan
Ziyi Yang
Longguang Zhong
Xiaojun Quan
Xinting Huang
Wei Bi
MoMe
27
1
0
25 Feb 2024
Learning to Route Among Specialized Experts for Zero-Shot Generalization
Mohammed Muqeeth
Haokun Liu
Yufan Liu
Colin Raffel
MoMe
37
34
0
08 Feb 2024
On the Emergence of Cross-Task Linearity in the Pretraining-Finetuning Paradigm
Zhanpeng Zhou
Zijun Chen
Yilan Chen
Bo-Wen Zhang
Junchi Yan
MoMe
19
10
0
06 Feb 2024
Model Breadcrumbs: Scaling Multi-Task Model Merging with Sparse Masks
Mohammad-Javad Davari
Eugene Belilovsky
MoMe
48
56
0
11 Dec 2023
ComPEFT: Compression for Communicating Parameter Efficient Updates via Sparsification and Quantization
Prateek Yadav
Leshem Choshen
Colin Raffel
Mohit Bansal
32
13
0
22 Nov 2023
Language and Task Arithmetic with Parameter-Efficient Layers for Zero-Shot Summarization
Alexandra Chronopoulou
Jonas Pfeiffer
Joshua Maynez
Xinyi Wang
Sebastian Ruder
Priyanka Agrawal
MoMe
26
16
0
15 Nov 2023
LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition
Chengsong Huang
Qian Liu
Bill Yuchen Lin
Tianyu Pang
Chao Du
Min Lin
MoMe
38
182
0
25 Jul 2023
Diverse Weight Averaging for Out-of-Distribution Generalization
Alexandre Ramé
Matthieu Kirchmeyer
Thibaud Rahier
A. Rakotomamonjy
Patrick Gallinari
Matthieu Cord
OOD
199
128
0
19 May 2022
PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts
Stephen H. Bach
Victor Sanh
Zheng-Xin Yong
Albert Webson
Colin Raffel
...
Khalid Almubarak
Xiangru Tang
Dragomir R. Radev
Mike Tian-Jian Jiang
Alexander M. Rush
VLM
225
339
0
02 Feb 2022
Multitask Prompted Training Enables Zero-Shot Task Generalization
Victor Sanh
Albert Webson
Colin Raffel
Stephen H. Bach
Lintang Sutawika
...
T. Bers
Stella Biderman
Leo Gao
Thomas Wolf
Alexander M. Rush
LRM
215
1,661
0
15 Oct 2021
Efficiently Identifying Task Groupings for Multi-Task Learning
Christopher Fifty
Ehsan Amid
Zhe Zhao
Tianhe Yu
Rohan Anil
Chelsea Finn
219
238
1
10 Sep 2021
SWAD: Domain Generalization by Seeking Flat Minima
Junbum Cha
Sanghyuk Chun
Kyungjae Lee
Han-Cheol Cho
Seunghyun Park
Yunsung Lee
Sungrae Park
MoMe
216
424
0
17 Feb 2021
Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks
Torsten Hoefler
Dan Alistarh
Tal Ben-Nun
Nikoli Dryden
Alexandra Peste
MQ
144
684
0
31 Jan 2021
Optimizing Mode Connectivity via Neuron Alignment
N. Joseph Tatro
Pin-Yu Chen
Payel Das
Igor Melnyk
P. Sattigeri
Rongjie Lai
MoMe
225
80
0
05 Sep 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
299
6,984
0
20 Apr 2018
Previous
1
2
3
4
5