1908.04207
Taming Unbalanced Training Workloads in Deep Learning with Partial Collective Operations
12 August 2019
Shigang Li
Tal Ben-Nun
Salvatore Di Girolamo
Dan Alistarh
Torsten Hoefler
Papers citing
"Taming Unbalanced Training Workloads in Deep Learning with Partial Collective Operations"
29 / 29 papers shown
Title
An In-Depth Analysis of the Slingshot Interconnect
Daniele De Sensi
Salvatore Di Girolamo
K. McMahon
Duncan Roweth
Torsten Hoefler
36
98
0
20 Aug 2020
Priority-based Parameter Propagation for Distributed DNN Training
Anand Jayarajan
Jinliang Wei
Garth A. Gibson
Alexandra Fedorova
Gennady Pekhimenko
AI4CE
55
181
0
10 May 2019
A Modular Benchmarking Infrastructure for High-Performance and Reproducible Deep Learning
Tal Ben-Nun
Maciej Besta
Simon Huber
A. Ziogas
D. Peter
Torsten Hoefler
ELM
ALM
54
77
0
29 Jan 2019
Stochastic Gradient Push for Distributed Deep Learning
Mahmoud Assran
Nicolas Loizou
Nicolas Ballas
Michael G. Rabbat
79
348
0
27 Nov 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
1.8K
95,175
0
11 Oct 2018
Exascale Deep Learning for Climate Analytics
Thorsten Kurth
Sean Treichler
Josh Romero
M. Mudigonda
Nathan Luehr
...
Michael A. Matheson
J. Deslippe
M. Fatica
P. Prabhat
Michael Houston
BDL
70
263
0
03 Oct 2018
The Convergence of Sparsified Gradient Methods
Dan Alistarh
Torsten Hoefler
M. Johansson
Sarit Khirirat
Nikola Konstantinov
Cédric Renggli
169
494
0
27 Sep 2018
CosmoFlow: Using Deep Learning to Learn the Universe at Scale
Amrita Mathuriya
Deborah Bard
P. Mendygral
Lawrence Meadows
James A. Arnemann
...
Nalini Kumar
S. Ho
Michael F. Ringenburg
P. Prabhat
Victor W. Lee
AI4CE
61
126
0
14 Aug 2018
GossipGraD: Scalable Deep Learning using Gossip Communication based Asynchronous Gradient Descent
J. Daily
Abhinav Vishnu
Charles Siegel
T. Warfel
Vinay C. Amatya
45
95
0
15 Mar 2018
Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis
Tal Ben-Nun
Torsten Hoefler
GNN
62
707
0
26 Feb 2018
SparCML: High-Performance Sparse Communication for Machine Learning
Cédric Renggli
Saleh Ashkboos
Mehdi Aghagolzadeh
Dan Alistarh
Torsten Hoefler
66
127
0
22 Feb 2018
Horovod: fast and easy distributed deep learning in TensorFlow
Alexander Sergeev
Mike Del Balso
100
1,221
0
15 Feb 2018
Asynchronous Decentralized Parallel Stochastic Gradient Descent
Xiangru Lian
Wei Zhang
Ce Zhang
Ji Liu
ODL
48
500
0
18 Oct 2017
sPIN: High-performance streaming Processing in the Network
Torsten Hoefler
Salvatore Di Girolamo
Konstantin Taranov
Ryan E. Grant
R. Brightwell
72
77
0
16 Sep 2017
Improved Regularization of Convolutional Neural Networks with Cutout
Terrance DeVries
Graham W. Taylor
127
3,774
0
15 Aug 2017
Attention Is All You Need
Ashish Vaswani
Noam M. Shazeer
Niki Parmar
Jakob Uszkoreit
Llion Jones
Aidan Gomez
Lukasz Kaiser
Illia Polosukhin
3DV
730
132,363
0
12 Jun 2017
Can Decentralized Algorithms Outperform Centralized Algorithms? A Case Study for Decentralized Parallel Stochastic Gradient Descent
Xiangru Lian
Ce Zhang
Huan Zhang
Cho-Jui Hsieh
Wei Zhang
Ji Liu
50
1,235
0
25 May 2017
How to scale distributed deep learning?
Peter H. Jin
Qiaochu Yuan
F. Iandola
Kurt Keutzer
3DH
62
137
0
14 Nov 2016
Densely Connected Convolutional Networks
Gao Huang
Zhuang Liu
Laurens van der Maaten
Kilian Q. Weinberger
PINN
3DV
786
36,881
0
25 Aug 2016
Deep Residual Learning for Image Recognition
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
MedIm
2.2K
194,426
0
10 Dec 2015
Rethinking the Inception Architecture for Computer Vision
Christian Szegedy
Vincent Vanhoucke
Sergey Ioffe
Jonathon Shlens
Z. Wojna
3DV
BDL
886
27,416
0
02 Dec 2015
Delving Deeper into Convolutional Networks for Learning Video Representations
Nicolas Ballas
L. Yao
C. Pal
Aaron Courville
MDE
90
701
0
19 Nov 2015
Staleness-aware Async-SGD for Distributed Deep Learning
Wei Zhang
Suyog Gupta
Xiangru Lian
Ji Liu
75
266
0
18 Nov 2015
Model Accuracy and Runtime Tradeoff in Distributed Deep Learning: A Systematic Study
Suyog Gupta
Wei Zhang
Fei Wang
65
172
0
14 Sep 2015
Beyond Short Snippets: Deep Networks for Video Classification
Joe Yue-Hei Ng
Matthew J. Hausknecht
Sudheendra Vijayanarasimhan
Oriol Vinyals
R. Monga
G. Toderici
145
2,338
0
31 Mar 2015
Long-term Recurrent Convolutional Networks for Visual Recognition and Description
Jeff Donahue
Lisa Anne Hendricks
Marcus Rohrbach
Subhashini Venugopalan
S. Guadarrama
Kate Saenko
Trevor Darrell
VLM
165
6,056
0
17 Nov 2014
Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan
Andrew Zisserman
FAtt
MDE
1.7K
100,508
0
04 Sep 2014
UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild
K. Soomro
Amir Zamir
M. Shah
CLIP
VGen
160
6,164
0
03 Dec 2012
HOGWILD!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent
Feng Niu
Benjamin Recht
Christopher Ré
Stephen J. Wright
201
2,274
0
28 Jun 2011