Taming Unbalanced Training Workloads in Deep Learning with Partial Collective Operations

12 August 2019
Shigang Li
Tal Ben-Nun
Salvatore Di Girolamo
Dan Alistarh
Torsten Hoefler

Papers citing "Taming Unbalanced Training Workloads in Deep Learning with Partial Collective Operations"

29 / 29 papers shown
An In-Depth Analysis of the Slingshot Interconnect
Daniele De Sensi
Salvatore Di Girolamo
K. McMahon
Duncan Roweth
Torsten Hoefler
36
98
0
20 Aug 2020
Priority-based Parameter Propagation for Distributed DNN Training
Anand Jayarajan
Jinliang Wei
Garth A. Gibson
Alexandra Fedorova
Gennady Pekhimenko
AI4CE
55
181
0
10 May 2019
A Modular Benchmarking Infrastructure for High-Performance and Reproducible Deep Learning
Tal Ben-Nun
Maciej Besta
Simon Huber
A. Ziogas
D. Peter
Torsten Hoefler
ELM, ALM
54
77
0
29 Jan 2019
Stochastic Gradient Push for Distributed Deep Learning
Mahmoud Assran
Nicolas Loizou
Nicolas Ballas
Michael G. Rabbat
79
348
0
27 Nov 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM, SSL, SSeg
1.8K
95,175
0
11 Oct 2018
Exascale Deep Learning for Climate Analytics
Thorsten Kurth
Sean Treichler
Josh Romero
M. Mudigonda
Nathan Luehr
...
Michael A. Matheson
J. Deslippe
M. Fatica
P. Prabhat
Michael Houston
BDL
70
263
0
03 Oct 2018
The Convergence of Sparsified Gradient Methods
Dan Alistarh
Torsten Hoefler
M. Johansson
Sarit Khirirat
Nikola Konstantinov
Cédric Renggli
169
494
0
27 Sep 2018
CosmoFlow: Using Deep Learning to Learn the Universe at Scale
Amrita Mathuriya
Deborah Bard
P. Mendygral
Lawrence Meadows
James A. Arnemann
...
Nalini Kumar
S. Ho
Michael F. Ringenburg
P. Prabhat
Victor W. Lee
AI4CE
61
126
0
14 Aug 2018
GossipGraD: Scalable Deep Learning using Gossip Communication based Asynchronous Gradient Descent
J. Daily
Abhinav Vishnu
Charles Siegel
T. Warfel
Vinay C. Amatya
45
95
0
15 Mar 2018
Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis
Tal Ben-Nun
Torsten Hoefler
GNN
62
707
0
26 Feb 2018
SparCML: High-Performance Sparse Communication for Machine Learning
Cédric Renggli
Saleh Ashkboos
Mehdi Aghagolzadeh
Dan Alistarh
Torsten Hoefler
66
127
0
22 Feb 2018
Horovod: fast and easy distributed deep learning in TensorFlow
Alexander Sergeev
Mike Del Balso
100
1,221
0
15 Feb 2018
Asynchronous Decentralized Parallel Stochastic Gradient Descent
Xiangru Lian
Wei Zhang
Ce Zhang
Ji Liu
ODL
48
500
0
18 Oct 2017
sPIN: High-performance streaming Processing in the Network
Torsten Hoefler
Salvatore Di Girolamo
Konstantin Taranov
Ryan E. Grant
R. Brightwell
72
77
0
16 Sep 2017
Improved Regularization of Convolutional Neural Networks with Cutout
Terrance DeVries
Graham W. Taylor
127
3,774
0
15 Aug 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
730
132,363
0
12 Jun 2017
Can Decentralized Algorithms Outperform Centralized Algorithms? A Case Study for Decentralized Parallel Stochastic Gradient Descent
Xiangru Lian
Ce Zhang
Huan Zhang
Cho-Jui Hsieh
Wei Zhang
Ji Liu
50
1,235
0
25 May 2017
How to scale distributed deep learning?
Peter H. Jin
Qiaochu Yuan
F. Iandola
Kurt Keutzer
3DH
62
137
0
14 Nov 2016
Densely Connected Convolutional Networks
Gao Huang
Zhuang Liu
Laurens van der Maaten
Kilian Q. Weinberger
PINN, 3DV
786
36,881
0
25 Aug 2016
Deep Residual Learning for Image Recognition
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
MedIm
2.2K
194,426
0
10 Dec 2015
Rethinking the Inception Architecture for Computer Vision
Christian Szegedy
Vincent Vanhoucke
Sergey Ioffe
Jonathon Shlens
Z. Wojna
3DV, BDL
886
27,416
0
02 Dec 2015
Delving Deeper into Convolutional Networks for Learning Video Representations
Nicolas Ballas
L. Yao
C. Pal
Aaron Courville
MDE
90
701
0
19 Nov 2015
Staleness-aware Async-SGD for Distributed Deep Learning
Wei Zhang
Suyog Gupta
Xiangru Lian
Ji Liu
75
266
0
18 Nov 2015
Model Accuracy and Runtime Tradeoff in Distributed Deep Learning: A Systematic Study
Suyog Gupta
Wei Zhang
Fei Wang
65
172
0
14 Sep 2015
Beyond Short Snippets: Deep Networks for Video Classification
Joe Yue-Hei Ng
Matthew J. Hausknecht
Sudheendra Vijayanarasimhan
Oriol Vinyals
R. Monga
G. Toderici
145
2,338
0
31 Mar 2015
Long-term Recurrent Convolutional Networks for Visual Recognition and Description
Jeff Donahue
Lisa Anne Hendricks
Marcus Rohrbach
Subhashini Venugopalan
S. Guadarrama
Kate Saenko
Trevor Darrell
VLM
165
6,056
0
17 Nov 2014
Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan
Andrew Zisserman
FAtt, MDE
1.7K
100,508
0
04 Sep 2014
UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild
K. Soomro
Amir Zamir
M. Shah
CLIP, VGen
160
6,164
0
03 Dec 2012
HOGWILD!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent
Feng Niu
Benjamin Recht
Christopher Ré
Stephen J. Wright
201
2,274
0
28 Jun 2011