Deep learning with Elastic Averaging SGD
arXiv:1412.6651 · 20 December 2014
Sixin Zhang, A. Choromańska, Yann LeCun
Community: FedML
Links: arXiv · PDF · HTML
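For context while skimming the listing: the paper's EASGD scheme has each worker take a local gradient step plus an elastic pull toward a shared center variable, while the center moves toward the average of the workers. Below is a minimal sketch of the synchronous update on a toy problem; the least-squares objective, the single-process simulation of "workers", and all hyperparameter values are illustrative assumptions, not the paper's experimental setup.

```python
# Minimal illustrative sketch of synchronous Elastic Averaging SGD (EASGD)
# on a toy least-squares problem. Everything below (data, hyperparameters,
# single-process worker loop) is an assumption made for illustration.
import numpy as np

rng = np.random.default_rng(0)

# Toy data: each worker holds its own shard of a linear-regression problem.
n_workers, n_samples, dim = 4, 256, 10
w_true = rng.normal(size=dim)
shards = []
for _ in range(n_workers):
    X = rng.normal(size=(n_samples, dim))
    y = X @ w_true + 0.1 * rng.normal(size=n_samples)
    shards.append((X, y))

def grad(w, X, y):
    """Gradient of the mean-squared-error loss for the linear model X @ w."""
    return X.T @ (X @ w - y) / len(y)

eta, rho = 0.05, 0.5      # learning rate and elastic penalty (assumed values)
alpha = eta * rho          # elastic step size

center = np.zeros(dim)                                 # center variable
workers = [np.zeros(dim) for _ in range(n_workers)]    # local variables

for step in range(200):
    new_workers = []
    elastic_sum = np.zeros(dim)
    for w_i, (X, y) in zip(workers, shards):
        diff = w_i - center
        # Local update: gradient step plus elastic pull toward the center.
        new_workers.append(w_i - eta * grad(w_i, X, y) - alpha * diff)
        elastic_sum += diff
    # Center update: the center moves toward the average of the workers.
    center = center + alpha * elastic_sum
    workers = new_workers

print("distance of center to w_true:", np.linalg.norm(center - w_true))
```

The elastic term is what allows each worker to explore away from the center while still being pulled back, which is the exploration/exploitation trade-off the paper emphasizes.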
Papers citing "Deep learning with Elastic Averaging SGD" (9 of 9 papers shown)

Pseudo-Asynchronous Local SGD: Robust and Efficient Data-Parallel Training
Hiroki Naganuma, Xinzhi Zhang, Man-Chung Yue, Ioannis Mitliagkas, Philipp A. Witte, Russell J. Hewett, Yin Tat Lee
126 · 0 · 0 · 25 Apr 2025

No Need to Talk: Asynchronous Mixture of Language Models
Anastasiia Filippova, Angelos Katharopoulos, David Grangier, Ronan Collobert
MoE · 54 · 0 · 0 · 04 Oct 2024

PipeOptim: Ensuring Effective 1F1B Schedule with Optimizer-Dependent Weight Prediction
Lei Guan, Dongsheng Li, Jiye Liang, Wenjian Wang, Xicheng Lu
56 · 1 · 0 · 01 Dec 2023

Weighted Averaged Stochastic Gradient Descent: Asymptotic Normality and Optimality
Ziyang Wei, Wanrong Zhu, Wei Biao Wu
44 · 5 · 0 · 13 Jul 2023

Integrated Model, Batch and Domain Parallelism in Training Neural Networks
A. Gholami, A. Azad, Peter H. Jin, Kurt Keutzer, A. Buluç
58 · 83 · 0 · 12 Dec 2017

Train longer, generalize better: closing the generalization gap in large batch training of neural networks
Elad Hoffer, Itay Hubara, Daniel Soudry
ODL · 138 · 798 · 0 · 24 May 2017

Asynchronous Stochastic Gradient Descent with Delay Compensation
Shuxin Zheng, Qi Meng, Taifeng Wang, Wei Chen, Nenghai Yu, Zhiming Ma, Tie-Yan Liu
80 · 313 · 0 · 27 Sep 2016

Distributed Bayesian Learning with Stochastic Natural-gradient Expectation Propagation and the Posterior Server
Leonard Hasenclever, Stefan Webb, Thibaut Lienart, Sebastian J. Vollmer, Balaji Lakshminarayanan, Charles Blundell, Yee Whye Teh
BDL · 77 · 70 · 0 · 31 Dec 2015

GPU Asynchronous Stochastic Gradient Descent to Speed Up Neural Network Training
T. Paine, Hailin Jin, Jianchao Yang, Zhe Lin, Thomas Huang
67 · 98 · 0 · 21 Dec 2013