
Taming Unbalanced Training Workloads in Deep Learning with Partial Collective Operations

12 August 2019

Shigang Li

Tal Ben-Nun

Salvatore Di Girolamo

Dan Alistarh

Torsten Hoefler


Papers citing "Taming Unbalanced Training Workloads in Deep Learning with Partial Collective Operations"


Title
An In-Depth Analysis of the Slingshot Interconnect Daniele De Sensi Salvatore Di Girolamo K. McMahon Duncan Roweth Torsten Hoefler 36 98 0 20 Aug 2020
Priority-based Parameter Propagation for Distributed DNN Training Anand Jayarajan Jinliang Wei Garth A. Gibson Alexandra Fedorova Gennady Pekhimenko AI4CE 55 181 0 10 May 2019
A Modular Benchmarking Infrastructure for High-Performance and Reproducible Deep Learning Tal Ben-Nun Maciej Besta Simon Huber A. Ziogas D. Peter Torsten Hoefler ELM ALM 54 77 0 29 Jan 2019
Stochastic Gradient Push for Distributed Deep Learning Mahmoud Assran Nicolas Loizou Nicolas Ballas Michael G. Rabbat 79 348 0 27 Nov 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding Jacob Devlin Ming-Wei Chang Kenton Lee Kristina Toutanova VLM SSL SSeg 1.8K 95,175 0 11 Oct 2018
Exascale Deep Learning for Climate Analytics Thorsten Kurth Sean Treichler Josh Romero M. Mudigonda Nathan Luehr ... Michael A. Matheson J. Deslippe M. Fatica P. Prabhat Michael Houston BDL 70 263 0 03 Oct 2018
The Convergence of Sparsified Gradient Methods Dan Alistarh Torsten Hoefler M. Johansson Sarit Khirirat Nikola Konstantinov Cédric Renggli 169 494 0 27 Sep 2018
CosmoFlow: Using Deep Learning to Learn the Universe at Scale Amrita Mathuriya Deborah Bard P. Mendygral Lawrence Meadows James A. Arnemann ... Nalini Kumar S. Ho Michael F. Ringenburg P. Prabhat Victor W. Lee AI4CE 61 126 0 14 Aug 2018
GossipGraD: Scalable Deep Learning using Gossip Communication based Asynchronous Gradient Descent J. Daily Abhinav Vishnu Charles Siegel T. Warfel Vinay C. Amatya 45 95 0 15 Mar 2018
Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis Tal Ben-Nun Torsten Hoefler GNN 62 707 0 26 Feb 2018
SparCML: High-Performance Sparse Communication for Machine Learning Cédric Renggli Saleh Ashkboos Mehdi Aghagolzadeh Dan Alistarh Torsten Hoefler 66 127 0 22 Feb 2018
Horovod: fast and easy distributed deep learning in TensorFlow Alexander Sergeev Mike Del Balso 100 1,221 0 15 Feb 2018
Asynchronous Decentralized Parallel Stochastic Gradient Descent Xiangru Lian Wei Zhang Ce Zhang Ji Liu ODL 48 500 0 18 Oct 2017
sPIN: High-performance streaming Processing in the Network Torsten Hoefler Salvatore Di Girolamo Konstantin Taranov Ryan E. Grant R. Brightwell 72 77 0 16 Sep 2017
Improved Regularization of Convolutional Neural Networks with Cutout Terrance Devries Graham W. Taylor 127 3,774 0 15 Aug 2017
Attention Is All You Need Ashish Vaswani Noam M. Shazeer Niki Parmar Jakob Uszkoreit Llion Jones Aidan Gomez Lukasz Kaiser Illia Polosukhin 3DV 730 132,363 0 12 Jun 2017
Can Decentralized Algorithms Outperform Centralized Algorithms? A Case Study for Decentralized Parallel Stochastic Gradient Descent Xiangru Lian Ce Zhang Huan Zhang Cho-Jui Hsieh Wei Zhang Ji Liu 50 1,235 0 25 May 2017
How to scale distributed deep learning? Peter H. Jin Qiaochu Yuan F. Iandola Kurt Keutzer 3DH 62 137 0 14 Nov 2016
Densely Connected Convolutional Networks Gao Huang Zhuang Liu Laurens van der Maaten Kilian Q. Weinberger PINN 3DV 786 36,881 0 25 Aug 2016
Deep Residual Learning for Image Recognition Kaiming He Xiangyu Zhang Shaoqing Ren Jian Sun MedIm 2.2K 194,426 0 10 Dec 2015
Rethinking the Inception Architecture for Computer Vision Christian Szegedy Vincent Vanhoucke Sergey Ioffe Jonathon Shlens Z. Wojna 3DV BDL 886 27,416 0 02 Dec 2015
Delving Deeper into Convolutional Networks for Learning Video Representations Nicolas Ballas L. Yao C. Pal Aaron Courville MDE 90 701 0 19 Nov 2015
Staleness-aware Async-SGD for Distributed Deep Learning Wei Zhang Suyog Gupta Xiangru Lian Ji Liu 75 266 0 18 Nov 2015
Model Accuracy and Runtime Tradeoff in Distributed Deep Learning: A Systematic Study Suyog Gupta Wei Zhang Fei Wang 65 172 0 14 Sep 2015
Beyond Short Snippets: Deep Networks for Video Classification Joe Yue-Hei Ng Matthew J. Hausknecht Sudheendra Vijayanarasimhan Oriol Vinyals R. Monga G. Toderici 145 2,338 0 31 Mar 2015
Long-term Recurrent Convolutional Networks for Visual Recognition and Description Jeff Donahue Lisa Anne Hendricks Marcus Rohrbach Subhashini Venugopalan S. Guadarrama Kate Saenko Trevor Darrell VLM 165 6,056 0 17 Nov 2014
Very Deep Convolutional Networks for Large-Scale Image Recognition Karen Simonyan Andrew Zisserman FAtt MDE 1.7K 100,508 0 04 Sep 2014
UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild K. Soomro Amir Zamir M. Shah CLIP VGen 160 6,164 0 03 Dec 2012
HOGWILD!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent Feng Niu Benjamin Recht Christopher Ré Stephen J. Wright 201 2,274 0 28 Jun 2011