Efficient Parallelization Layouts for Large-Scale Distributed Model Training
arXiv: 2311.05610 (9 November 2023)
Johannes Hagemann, Samuel Weinbach, Konstantin Dobler, Maximilian Schall, Gerard de Melo
Papers citing "Efficient Parallelization Layouts for Large-Scale Distributed Model Training" (4 / 4 papers shown):
Hardware Scaling Trends and Diminishing Returns in Large-Scale Distributed Training
Jared Fernandez, Luca Wehrstedt, Leonid Shamis, Mostafa Elhoushi, Kalyan Saladi, Yonatan Bisk, Emma Strubell, Jacob Kahn
20 Nov 2024
Stable LM 2 1.6B Technical Report
Marco Bellagente, J. Tow, Dakota Mahan, Duy Phung, Maksym Zhuravinskyi, ..., Paulo Rocha, Harry Saini, H. Teufel, Niccoló Zanichelli, Carlos Riquelme
27 Feb 2024
Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
Ofir Press, Noah A. Smith, M. Lewis
27 Aug 2021
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi, M. Patwary, Raul Puri, P. LeGresley, Jared Casper, Bryan Catanzaro
17 Sep 2019