Towards Crowdsourced Training of Large Neural Networks using Decentralized Mixture-of-Experts
Max Ryabinin, Anton I. Gusev
arXiv:2002.04013 · 10 February 2020 · FedML

Papers citing "Towards Crowdsourced Training of Large Neural Networks using Decentralized Mixture-of-Experts" (14 papers shown)

Nesterov Method for Asynchronous Pipeline Parallel Optimization
Thalaiyasingam Ajanthan, Sameera Ramasinghe, Yan Zuo, Gil Avraham, Alexander Long
02 May 2025

DiPaCo: Distributed Path Composition
Arthur Douillard, Qixuan Feng, Andrei A. Rusu, A. Kuncoro, Yani Donchev, Rachita Chhaparia, Ionel Gog, Marc'Aurelio Ranzato, Jiajun Shen, Arthur Szlam
15 Mar 2024 · MoE

Video Relationship Detection Using Mixture of Experts
A. Shaabana, Zahra Gharaee, Paul Fieguth
06 Mar 2024

Social Interpretable Reinforcement Learning
Leonardo Lucio Custode, Giovanni Iacca
27 Jan 2024 · OffRL

A Language Model of Java Methods with Train/Test Deduplication
Chia-Yi Su, Aakash Bansal, Vijayanta Jain, S. Ghanavati, Collin McMillan
15 May 2023 · SyDa, VLM

SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient
Max Ryabinin, Tim Dettmers, Michael Diskin, Alexander Borzunov
27 Jan 2023 · MoE

Decentralized Training of Foundation Models in Heterogeneous Environments
Binhang Yuan, Yongjun He, Jared Davis, Tianyi Zhang, Tri Dao, Beidi Chen, Percy Liang, Christopher Ré, Ce Zhang
02 Jun 2022

Mixed SIGNals: Sign Language Production via a Mixture of Motion Primitives
Ben Saunders, Necati Cihan Camgöz, Richard Bowden
23 Jul 2021 · SLR

Secure Distributed Training at Scale
Eduard A. Gorbunov, Alexander Borzunov, Michael Diskin, Max Ryabinin
21 Jun 2021 · FedML

Distributed Deep Learning Using Volunteer Computing-Like Paradigm
Medha Atre, B. Jha, Ashwini Rao
16 Mar 2021

Moshpit SGD: Communication-Efficient Decentralized Training on Heterogeneous Unreliable Devices
Max Ryabinin, Eduard A. Gorbunov, Vsevolod Plokhotnyuk, Gennady Pekhimenko
04 Mar 2021

Scaling Laws for Neural Language Models
Jared Kaplan, Sam McCandlish, T. Henighan, Tom B. Brown, B. Chess, R. Child, Scott Gray, Alec Radford, Jeff Wu, Dario Amodei
23 Jan 2020

Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi, M. Patwary, Raul Puri, P. LeGresley, Jared Casper, Bryan Catanzaro
17 Sep 2019 · MoE

Analyzing Federated Learning through an Adversarial Lens
A. Bhagoji, Supriyo Chakraborty, Prateek Mittal, S. Calo
29 Nov 2018 · FedML