ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

Multi-node Bert-pretraining: Cost-efficient Approach
1 August 2020
Jiahuang Lin
Xuelong Li
Gennady Pekhimenko
arXiv: 2008.00177

Papers citing "Multi-node Bert-pretraining: Cost-efficient Approach" (5 of 5 shown)

  • Unit Scaling: Out-of-the-Box Low-Precision Training
    Charlie Blake, Douglas Orr, Carlo Luschi (MQ) — 20 Mar 2023
  • SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient
    Max Ryabinin, Tim Dettmers, Michael Diskin, Alexander Borzunov (MoE) — 27 Jan 2023
  • LV-BERT: Exploiting Layer Variety for BERT
    Weihao Yu, Zihang Jiang, Fei Chen, Qibin Hou, Jiashi Feng (MQ) — 22 Jun 2021
  • Distributed Deep Learning in Open Collaborations
    Michael Diskin, Alexey Bukhtiyarov, Max Ryabinin, Lucile Saulnier, Quentin Lhoest, ..., Denis Mazur, Ilia Kobelev, Yacine Jernite, Thomas Wolf, Gennady Pekhimenko (FedML) — 18 Jun 2021
  • Moshpit SGD: Communication-Efficient Decentralized Training on Heterogeneous Unreliable Devices
    Max Ryabinin, Eduard A. Gorbunov, Vsevolod Plokhotnyuk, Gennady Pekhimenko — 04 Mar 2021