Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2010.12562
Cited By
On the Transformer Growth for Progressive BERT Training
23 October 2020
Xiaotao Gu
Liyuan Liu
Hongkun Yu
Jing Li
C. L. P. Chen
Jiawei Han
VLM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"On the Transformer Growth for Progressive BERT Training"
13 / 13 papers shown
Title
A multilevel approach to accelerate the training of Transformers
Guillaume Lauga
Maël Chaumette
Edgar Desainte-Maréville
Étienne Lasalle
Arthur Lebeurrier
AI4CE
40
0
0
24 Apr 2025
Stacking as Accelerated Gradient Descent
Naman Agarwal
Pranjal Awasthi
Satyen Kale
Eric Zhao
ODL
67
2
0
20 Feb 2025
Beyond Next Token Prediction: Patch-Level Training for Large Language Models
Chenze Shao
Fandong Meng
Jie Zhou
46
1
0
17 Jul 2024
A Multi-Level Framework for Accelerating Training Transformer Models
Longwei Zou
Han Zhang
Yangdong Deng
AI4CE
32
1
0
07 Apr 2024
FLM-101B: An Open LLM and How to Train It with
100
K
B
u
d
g
e
t
100K Budget
100
K
B
u
d
g
e
t
Xiang Li
Yiqun Yao
Xin Jiang
Xuezhi Fang
Xuying Meng
...
LI DU
Bowen Qin
Zheng-Wei Zhang
Aixin Sun
Yequan Wang
55
21
0
07 Sep 2023
On Efficient Training of Large-Scale Deep Learning Models: A Literature Review
Li Shen
Yan Sun
Zhiyuan Yu
Liang Ding
Xinmei Tian
Dacheng Tao
VLM
30
40
0
07 Apr 2023
A Survey on Efficient Training of Transformers
Bohan Zhuang
Jing Liu
Zizheng Pan
Haoyu He
Yuetian Weng
Chunhua Shen
25
47
0
02 Feb 2023
Multi-CLS BERT: An Efficient Alternative to Traditional Ensembling
Haw-Shiuan Chang
Ruei-Yao Sun
Kathryn Ricci
Andrew McCallum
43
14
0
10 Oct 2022
Automated Progressive Learning for Efficient Training of Vision Transformers
Changlin Li
Bohan Zhuang
Guangrun Wang
Xiaodan Liang
Xiaojun Chang
Yi Yang
23
46
0
28 Mar 2022
ELLE: Efficient Lifelong Pre-training for Emerging Data
Yujia Qin
Jiajie Zhang
Yankai Lin
Zhiyuan Liu
Peng Li
Maosong Sun
Jie Zhou
19
67
0
12 Mar 2022
bert2BERT: Towards Reusable Pretrained Language Models
Cheng Chen
Yichun Yin
Lifeng Shang
Xin Jiang
Yujia Qin
Fengyu Wang
Zhi Wang
Xiao Chen
Zhiyuan Liu
Qun Liu
VLM
22
59
0
14 Oct 2021
UCPhrase: Unsupervised Context-aware Quality Phrase Tagging
Xiaotao Gu
Zihan Wang
Zhenyu Bi
Yu Meng
Liyuan Liu
Jiawei Han
Jingbo Shang
74
36
0
28 May 2021
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
297
6,956
0
20 Apr 2018
1