2211.00107
Cited By
Where to start? Analyzing the potential value of intermediate models
31 October 2022
Leshem Choshen
Elad Venezian
Shachar Don-Yehiya
Noam Slonim
Yoav Katz
MoMe
Papers citing
"Where to start? Analyzing the potential value of intermediate models"
23 / 23 papers shown
Title
Findings of the BabyLM Challenge: Sample-Efficient Pretraining on Developmentally Plausible Corpora
Alex Warstadt
Aaron Mueller
Leshem Choshen
E. Wilcox
Chengxu Zhuang
...
Rafael Mosquera
Bhargavi Paranjape
Adina Williams
Tal Linzen
Ryan Cotterell
54
110
0
10 Apr 2025
How to Merge Your Multimodal Models Over Time?
Sebastian Dziadzio
Vishaal Udandarao
Karsten Roth
Ameya Prabhu
Zeynep Akata
Samuel Albanie
Matthias Bethge
MoMe
109
4
0
09 Dec 2024
Local Superior Soups: A Catalyst for Model Merging in Cross-Silo Federated Learning
Minghui Chen
Meirui Jiang
Xin Zhang
Qi Dou
Zehua Wang
Xiaoxiao Li
MoMe
FedML
53
2
0
31 Oct 2024
Model merging with SVD to tie the Knots
George Stoica
Pratik Ramesh
B. Ecsedi
Leshem Choshen
Judy Hoffman
MoMe
44
9
0
25 Oct 2024
What Matters for Model Merging at Scale?
Prateek Yadav
Tu Vu
Jonathan Lai
Alexandra Chronopoulou
Manaal Faruqui
Joey Tianyi Zhou
Tsendsuren Munkhdalai
MoMe
46
16
0
04 Oct 2024
Compress then Serve: Serving Thousands of LoRA Adapters with Little Overhead
Rickard Brüel-Gabrielsson
Jiacheng Zhu
Onkar Bhardwaj
Leshem Choshen
Kristjan Greenewald
Mikhail Yurochkin
Justin Solomon
58
5
0
17 Jun 2024
IDoFew: Intermediate Training Using Dual-Clustering in Language Models for Few Labels Text Classification
Abdullah Alsuhaibani
Hamad Zogan
Imran Razzak
Shoaib Jameel
Guandong Xu
43
4
0
08 Jan 2024
RSVP: Customer Intent Detection via Agent Response Contrastive and Generative Pre-Training
Yu-Chien Tang
Wei-Yao Wang
An-Zi Yen
Wenjie Peng
34
1
0
15 Oct 2023
Deep Model Fusion: A Survey
Weishi Li
Yong Peng
Miao Zhang
Liang Ding
Han Hu
Li Shen
FedML
MoMe
46
53
0
27 Sep 2023
Efficient Benchmarking of Language Models
Yotam Perlitz
Elron Bandel
Ariel Gera
Ofir Arviv
L. Ein-Dor
Eyal Shnarch
Noam Slonim
Michal Shmueli-Scheuer
Leshem Choshen
ALM
29
24
0
22 Aug 2023
Leveraging Few-Shot Data Augmentation and Waterfall Prompting for Response Generation
Lea Krause
Selene Báez Santamaría
Michiel van der Meer
Urja Khurana
30
3
0
02 Aug 2023
How Different Is Stereotypical Bias Across Languages?
Ibrahim Tolga Ozturk
R. Nedelchev
C. Heumann
Esteban Garces Arias
Marius Roger
Bernd Bischl
Matthias Aßenmacher
51
2
0
14 Jul 2023
Towards Robust and Efficient Continual Language Learning
Adam Fisch
Amal Rannen-Triki
Razvan Pascanu
J. Bornschein
Angeliki Lazaridou
E. Gribovskaya
Marc'Aurelio Ranzato
CLL
37
1
0
11 Jul 2023
Git-Theta: A Git Extension for Collaborative Development of Machine Learning Models
Nikhil Kandpal
Brian Lester
Mohammed Muqeeth
Anisha Mascarenhas
Monty Evans
Vishal Baskaran
Tenghao Huang
Haokun Liu
Colin Raffel
VLM
29
10
0
07 Jun 2023
Soft Merging of Experts with Adaptive Routing
Mohammed Muqeeth
Haokun Liu
Colin Raffel
MoMe
MoE
42
47
0
06 Jun 2023
QAID: Question Answering Inspired Few-shot Intent Detection
Asaf Yehudai
Matan Vetzler
Y. Mass
Koren Lazar
Doron Cohen
Boaz Carmeli
25
7
0
02 Mar 2023
Competence-Based Analysis of Language Models
Adam Davies
Jize Jiang
Chengxiang Zhai
ELM
34
4
0
01 Mar 2023
Knowledge is a Region in Weight Space for Fine-tuned Language Models
Almog Gueta
Elad Venezian
Colin Raffel
Noam Slonim
Yoav Katz
Leshem Choshen
39
50
0
09 Feb 2023
Model Ratatouille: Recycling Diverse Models for Out-of-Distribution Generalization
Alexandre Ramé
Kartik Ahuja
Jianyu Zhang
Matthieu Cord
Léon Bottou
David Lopez-Paz
MoMe
OODD
42
82
0
20 Dec 2022
ColD Fusion: Collaborative Descent for Distributed Multitask Finetuning
Shachar Don-Yehiya
Elad Venezian
Colin Raffel
Noam Slonim
Yoav Katz
Leshem Choshen
MoMe
33
52
0
02 Dec 2022
Linear Connectivity Reveals Generalization Strategies
Jeevesh Juneja
Rachit Bansal
Kyunghyun Cho
João Sedoc
Naomi Saphra
244
45
0
24 May 2022
e-SNLI: Natural Language Inference with Natural Language Explanations
Oana-Maria Camburu
Tim Rocktäschel
Thomas Lukasiewicz
Phil Blunsom
LRM
287
625
0
04 Dec 2018
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
304
7,005
0
20 Apr 2018