ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

Deep learning with Elastic Averaging SGD

20 December 2014
Sixin Zhang
A. Choromańska
Yann LeCun
    FedML

Papers citing "Deep learning with Elastic Averaging SGD"

9 papers shown

Pseudo-Asynchronous Local SGD: Robust and Efficient Data-Parallel Training
Hiroki Naganuma, Xinzhi Zhang, Man-Chung Yue, Ioannis Mitliagkas, Philipp A. Witte, Russell J. Hewett, Yin Tat Lee
25 Apr 2025

No Need to Talk: Asynchronous Mixture of Language Models (MoE)
Anastasiia Filippova, Angelos Katharopoulos, David Grangier, Ronan Collobert
04 Oct 2024

PipeOptim: Ensuring Effective 1F1B Schedule with Optimizer-Dependent Weight Prediction
Lei Guan, Dongsheng Li, Jiye Liang, Wenjian Wang, Xicheng Lu
01 Dec 2023

Weighted Averaged Stochastic Gradient Descent: Asymptotic Normality and Optimality
Ziyang Wei, Wanrong Zhu, Wei Biao Wu
13 Jul 2023

Integrated Model, Batch and Domain Parallelism in Training Neural Networks
A. Gholami, A. Azad, Peter H. Jin, Kurt Keutzer, A. Buluç
12 Dec 2017

Train longer, generalize better: closing the generalization gap in large batch training of neural networks (ODL)
Elad Hoffer, Itay Hubara, Daniel Soudry
24 May 2017

Asynchronous Stochastic Gradient Descent with Delay Compensation
Shuxin Zheng, Qi Meng, Taifeng Wang, Wei Chen, Nenghai Yu, Zhiming Ma, Tie-Yan Liu
27 Sep 2016

Distributed Bayesian Learning with Stochastic Natural-gradient Expectation Propagation and the Posterior Server (BDL)
Leonard Hasenclever, Stefan Webb, Thibaut Lienart, Sebastian J. Vollmer, Balaji Lakshminarayanan, Charles Blundell, Yee Whye Teh
31 Dec 2015

GPU Asynchronous Stochastic Gradient Descent to Speed Up Neural Network Training
T. Paine, Hailin Jin, Jianchao Yang, Zhe Lin, Thomas Huang
21 Dec 2013