arXiv: 2305.02869
Masked Structural Growth for 2x Faster Language Model Pre-training
4 May 2023 · Yiqun Yao, Zheng-Wei Zhang, Jing Li, Yequan Wang
Tags: OffRL, AI4CE, LRM
Papers citing "Masked Structural Growth for 2x Faster Language Model Pre-training" (7 / 7 papers shown)
Frozen Layers: Memory-efficient Many-fidelity Hyperparameter Optimization
14 Apr 2025 · Timur Carstensen, Neeratyoy Mallik, Frank Hutter, Martin Rapp · AI4CE · 30 / 0 / 0
STEP: Staged Parameter-Efficient Pre-training for Large Language Models
05 Apr 2025 · Kazuki Yano, Takumi Ito, Jun Suzuki · LRM · 47 / 1 / 0
Beyond Next Token Prediction: Patch-Level Training for Large Language Models
17 Jul 2024 · Chenze Shao, Fandong Meng, Jie Zhou · 49 / 1 / 0
FLM-101B: An Open LLM and How to Train It with $100K Budget
07 Sep 2023 · Xiang Li, Yiqun Yao, Xin Jiang, Xuezhi Fang, Xuying Meng, ..., Li Du, Bowen Qin, Zheng-Wei Zhang, Aixin Sun, Yequan Wang · 60 / 21 / 0
Composable Function-preserving Expansions for Transformer Architectures
11 Aug 2023 · Andrea Gesmundo, Kaitlin Maile · AI4CE · 40 / 8 / 0
On the Transformer Growth for Progressive BERT Training
23 Oct 2020 · Xiaotao Gu, Liyuan Liu, Hongkun Yu, Jing Li, Chong Chen, Jiawei Han · VLM · 69 / 51 / 0
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
20 Apr 2018 · Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman · ELM · 297 / 6,984 / 0