2306.11903
Deep Fusion: Efficient Network Training via Pre-trained Initializations
20 June 2023
Hanna Mazzawi
X. Gonzalvo
Michael Wunder
Sammy Jerome
Benoit Dherin
AI4CE
Papers citing
"Deep Fusion: Efficient Network Training via Pre-trained Initializations"
3 / 3 papers shown
Composable Function-preserving Expansions for Transformer Architectures
Andrea Gesmundo
Kaitlin Maile
AI4CE
37
8
0
11 Aug 2023
On the Transformer Growth for Progressive BERT Training
Xiaotao Gu
Liyuan Liu
Hongkun Yu
Jing Li
Cheng Chen
Jiawei Han
VLM
69
51
0
23 Oct 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
246
4,489
0
23 Jan 2020