GrowLength: Accelerating LLMs Pretraining by Progressively Growing Training Length
arXiv:2310.00576 · 1 October 2023
Hongye Jin, Xiaotian Han, Jingfeng Yang, Zhimeng Jiang, Chia-Yuan Chang, Xia Hu
Papers citing "GrowLength: Accelerating LLMs Pretraining by Progressively Growing Training Length" (11 of 11 papers shown)
SkyLadder: Better and Faster Pretraining via Context Window Scheduling
Tongyao Zhu, Qian Liu, Haonan Wang, Shiqi Chen, Xiangming Gu, Tianyu Pang, Min-Yen Kan
19 Mar 2025

Dataset Decomposition: Faster LLM Training with Variable Sequence Length Curriculum
Hadi Pouransari, Chun-Liang Li, Jen-Hao Rick Chang, Pavan Kumar Anasosalu Vasu, Cem Koc, Vaishaal Shankar, Oncel Tuzel
08 Jan 2025

Adaptive Batch Size Schedules for Distributed Training of Language Models with Data and Model Parallelism
Tim Tsz-Kit Lau, Weijian Li, Chenwei Xu, Han Liu, Mladen Kolar
30 Dec 2024

Correlation-Aware Select and Merge Attention for Efficient Fine-Tuning and Context Length Extension
Ning Wang, Zekun Li, Tongxin Bai, Guoqi Li
05 Oct 2024

Achieving Peak Performance for Large Language Models: A Systematic Review
Z. R. K. Rostam, Sándor Szénási, Gábor Kertész
07 Sep 2024

World Model on Million-Length Video And Language With Blockwise RingAttention
Hao Liu, Wilson Yan, Matei A. Zaharia, Pieter Abbeel
13 Feb 2024 · VGen

The What, Why, and How of Context Length Extension Techniques in Large Language Models -- A Detailed Survey
Saurav Pawar, S.M. Towhidul Islam Tonmoy, S. M. M. Zaman, Vinija Jain, Aman Chadha, Amitava Das
15 Jan 2024

Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond
Jingfeng Yang, Hongye Jin, Ruixiang Tang, Xiaotian Han, Qizhang Feng, Haoming Jiang, Bing Yin, Xia Hu
26 Apr 2023 · LM&MA

Instruction Tuning with GPT-4
Baolin Peng, Chunyuan Li, Pengcheng He, Michel Galley, Jianfeng Gao
06 Apr 2023 · SyDa, ALM, LM&MA

Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
Ofir Press, Noah A. Smith, Mike Lewis
27 Aug 2021

Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
Mohammad Shoeybi, Mostofa Patwary, Raul Puri, Patrick LeGresley, Jared Casper, Bryan Catanzaro
17 Sep 2019 · MoE