ResearchTrend.AI
Revisiting Distributed Synchronous SGD

4 April 2016
Jianmin Chen
Xinghao Pan
R. Monga
Samy Bengio
Rafal Jozefowicz

Papers citing "Revisiting Distributed Synchronous SGD"

27 / 27 papers shown
Understanding Stragglers in Large Model Training Using What-if Analysis
Jinkun Lin
Ziheng Jiang
Zuquan Song
Sida Zhao
Menghan Yu
...
Shuguang Wang
Yanghua Peng
Xin Liu
Aurojit Panda
Jinyang Li
68
1
0
09 May 2025
ATA: Adaptive Task Allocation for Efficient Resource Management in Distributed Machine Learning
Artavazd Maranjyan
El Mehdi Saad
Peter Richtárik
Francesco Orabona
101
0
0
02 Feb 2025
Ringmaster ASGD: The First Asynchronous SGD with Optimal Time Complexity
Artavazd Maranjyan
Alexander Tyurin
Peter Richtárik
66
3
0
27 Jan 2025
Distributed Stochastic Gradient Descent with Staleness: A Stochastic Delay Differential Equation Based Framework
Siyuan Yu
Wei Chen
H. V. Poor
63
0
0
17 Jun 2024
DP-DyLoRA: Fine-Tuning Transformer-Based Models On-Device under Differentially Private Federated Learning using Dynamic Low-Rank Adaptation
Jie Xu
Karthikeyan P. Saravanan
Rogier van Dalen
Haaris Mehmood
David Tuckey
Mete Ozay
97
6
0
10 May 2024
Joint Parameter-and-Bandwidth Allocation for Improving the Efficiency of Partitioned Edge Learning
Dingzhu Wen
M. Bennis
Kaibin Huang
40
48
0
10 Mar 2020
Large Batch Training of Convolutional Networks
Yang You
Igor Gitman
Boris Ginsburg
ODL
95
842
0
13 Aug 2017
Collective Robot Reinforcement Learning with Distributed Asynchronous Guided Policy Search
Ali Yahya
A. Li
Mrinal Kalakrishnan
Yevgen Chebotar
Sergey Levine
OffRL
47
155
0
03 Oct 2016
Asynchronous Stochastic Gradient Descent with Delay Compensation
Shuxin Zheng
Qi Meng
Taifeng Wang
Wei Chen
Nenghai Yu
Zhiming Ma
Tie-Yan Liu
80
313
0
27 Sep 2016
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
ODL
350
2,913
0
15 Sep 2016
Conditional Image Generation with PixelCNN Decoders
Aaron van den Oord
Nal Kalchbrenner
Oriol Vinyals
L. Espeholt
Alex Graves
Koray Kavukcuoglu
VLM
120
2,490
0
16 Jun 2016
ASAGA: Asynchronous Parallel SAGA
Rémi Leblond
Fabian Pedregosa
Simon Lacoste-Julien
AI4TS
50
101
0
15 Jun 2016
Distributed Deep Learning Using Synchronous Stochastic Gradient Descent
Dipankar Das
Sasikanth Avancha
Dheevatsa Mudigere
K. Vaidyanathan
Srinivas Sridharan
Dhiraj D. Kalamkar
Bharat Kaul
Pradeep Dubey
GNN
48
170
0
22 Feb 2016
Exploring the Limits of Language Modeling
Rafal Jozefowicz
Oriol Vinyals
M. Schuster
Noam M. Shazeer
Yonghui Wu
105
1,143
0
07 Feb 2016
Rethinking the Inception Architecture for Computer Vision
Christian Szegedy
Vincent Vanhoucke
Sergey Ioffe
Jonathon Shlens
Z. Wojna
3DV
BDL
416
27,231
0
02 Dec 2015
Staleness-aware Async-SGD for Distributed Deep Learning
Wei Zhang
Suyog Gupta
Xiangru Lian
Ji Liu
50
266
0
18 Nov 2015
Perturbed Iterate Analysis for Asynchronous Stochastic Optimization
Horia Mania
Xinghao Pan
Dimitris Papailiopoulos
Benjamin Recht
Kannan Ramchandran
Michael I. Jordan
71
231
0
24 Jul 2015
Splash: User-friendly Programming Interface for Parallelizing Stochastic Algorithms
Yuchen Zhang
Michael I. Jordan
40
20
0
24 Jun 2015
On Variance Reduction in Stochastic Gradient Descent and its Asynchronous Variants
Sashank J. Reddi
Ahmed S. Hefny
S. Sra
Barnabás Póczós
Alex Smola
96
195
0
23 Jun 2015
Taming the Wild: A Unified Analysis of Hogwild!-Style Algorithms
Christopher De Sa
Ce Zhang
K. Olukotun
Christopher Ré
43
204
0
22 Jun 2015
DRAW: A Recurrent Neural Network For Image Generation
Karol Gregor
Ivo Danihelka
Alex Graves
Danilo Jimenez Rezende
Daan Wierstra
GAN
DRL
142
1,959
0
16 Feb 2015
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Sergey Ioffe
Christian Szegedy
OOD
279
43,154
0
11 Feb 2015
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
678
149,474
0
22 Dec 2014
Deep learning with Elastic Averaging SGD
Sixin Zhang
A. Choromańska
Yann LeCun
FedML
45
609
0
20 Dec 2014
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky
Jia Deng
Hao Su
J. Krause
S. Satheesh
...
A. Karpathy
A. Khosla
Michael S. Bernstein
Alexander C. Berg
Li Fei-Fei
VLM
ObjD
986
39,383
0
01 Sep 2014
One Billion Word Benchmark for Measuring Progress in Statistical Language Modeling
Ciprian Chelba
Tomas Mikolov
M. Schuster
Qi Ge
T. Brants
P. Koehn
T. Robinson
102
1,099
0
11 Dec 2013
HOGWILD!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent
Feng Niu
Benjamin Recht
Christopher Ré
Stephen J. Wright
121
2,272
0
28 Jun 2011