Horovod: fast and easy distributed deep learning in TensorFlow

15 February 2018
Alexander Sergeev
Mike Del Balso
ArXiv (abs) | PDF | HTML | GitHub (14494★)

Papers citing "Horovod: fast and easy distributed deep learning in TensorFlow"

50 / 454 papers shown
Hardware/Software Co-Exploration of Neural Architectures
Weiwen Jiang
Lei Yang
E. Sha
Qingfeng Zhuge
Shouzhen Gu
Sakyasingha Dasgupta
Yiyu Shi
Jiaxi Hu
104
132
0
06 Jul 2019
Faster Distributed Deep Net Training: Computation and Communication Decoupled Stochastic Gradient Descent
Shuheng Shen
Linli Xu
Jingchang Liu
Xianfeng Liang
Yifei Cheng
ODL FedML
68
24
0
28 Jun 2019
Fully Decoupled Neural Network Learning Using Delayed Gradients
Huiping Zhuang
Yi Wang
Qinglai Liu
Shuai Zhang
Zhiping Lin
FedML
83
31
0
21 Jun 2019
Deep Leakage from Gradients
Ligeng Zhu
Zhijian Liu
Song Han
FedML
114
2,242
0
21 Jun 2019
Predicting Motion of Vulnerable Road Users using High-Definition Maps and Efficient ConvNets
Fang-Chieh Chou
Tsung-Han Lin
Henggang Cui
Vladan Radosavljevic
Thi Nguyen
Tzu-Kuo Huang
Matthew Niedoba
J. Schneider
Nemanja Djuric
58
59
0
20 Jun 2019
Margin Matters: Towards More Discriminative Deep Neural Network Embeddings for Speaker Recognition
Xu Xiang
Shuai Wang
Houjun Huang
Y. Qian
Kai Yu
DRL
77
145
0
18 Jun 2019
High-Performance Deep Learning via a Single Building Block
E. Georganas
K. Banerjee
Dhiraj D. Kalamkar
Sasikanth Avancha
Anand Venkat
Michael J. Anderson
G. Henry
Hans Pabst
A. Heinecke
43
12
0
15 Jun 2019
Layered SGD: A Decentralized and Synchronous SGD Algorithm for Scalable Deep Neural Network Training
K. Yu
Thomas Flynn
Shinjae Yoo
N. D'Imperio
OffRL
58
6
0
13 Jun 2019
Qsparse-local-SGD: Distributed SGD with Quantization, Sparsification, and Local Computations
Debraj Basu
Deepesh Data
C. Karakuş
Suhas Diggavi
MQ
74
408
0
06 Jun 2019
Leader Stochastic Gradient Descent for Distributed Training of Deep Learning Models: Extension
Yunfei Teng
Wenbo Gao
F. Chalus
A. Choromańska
Shiqian Ma
Adrian Weller
142
12
0
24 May 2019
Deploying AI Frameworks on Secure HPC Systems with Containers
D. Brayford
S. Vallecorsa
Atanas Z. Atanasov
F. Baruffa
Walter Riviera
32
18
0
24 May 2019
Precipitation Nowcasting with Satellite Imagery
V. Lebedev
V. Ivashkin
Irina Rudenko
A. Ganshin
A. Molchanov
Sergey Ovcharenko
Ruslan Grokhovetskiy
Ivan Bushmarinov
D. Solomentsev
AI4Cl
81
71
0
23 May 2019
Scaling Distributed Training of Flood-Filling Networks on HPC Infrastructure for Brain Mapping
Wu Dong
Murat Keçeli
Rafael Vescovi
Hanyu Li
Corey Adams
...
T. Uram
V. Vishwanath
N. Ferrier
B. Kasthuri
P. Littlewood
FedML AI4CE
40
9
0
13 May 2019
Deep Fusion Network for Image Completion
Xin Hong
Pengfei Xiong
Renhe Ji
Haoqiang Fan
3DV
77
96
0
17 Apr 2019
ResUNet-a: a deep learning framework for semantic segmentation of remotely sensed data
F. Diakogiannis
F. Waldner
P. Caccetta
Chen Wu
SSeg
155
1,359
0
01 Apr 2019
Scalable Deep Learning on Distributed Infrastructures: Challenges, Techniques and Tools
R. Mayer
Hans-Arno Jacobsen
GNN
81
193
0
27 Mar 2019
TensorFlow Doing HPC
Steven W. D. Chien
Stefano Markidis
V. Olshevsky
Yaroslav Bulatov
Erwin Laure
Jeffrey S. Vetter
51
15
0
11 Mar 2019
Optimizing Network Performance for Distributed DNN Training on GPU Clusters: ImageNet/AlexNet Training in 1.5 Minutes
Peng Sun
Wansen Feng
Ruobing Han
Shengen Yan
Yonggang Wen
AI4CE
100
70
0
19 Feb 2019
CodedReduce: A Fast and Robust Framework for Gradient Aggregation in Distributed Learning
Amirhossein Reisizadeh
Saurav Prakash
Ramtin Pedarsani
A. Avestimehr
84
25
0
06 Feb 2019
TF-Replicator: Distributed Machine Learning for Researchers
P. Buchlovsky
David Budden
Dominik Grewe
Chris Jones
John Aslanides
...
Aidan Clark
Sergio Gomez Colmenarejo
Aedan Pope
Fabio Viola
Dan Belov
GNN OffRL AI4CE
81
20
0
01 Feb 2019
A Modular Benchmarking Infrastructure for High-Performance and Reproducible Deep Learning
Tal Ben-Nun
Maciej Besta
Simon Huber
A. Ziogas
D. Peter
Torsten Hoefler
ELM ALM
69
78
0
29 Jan 2019
Augment your batch: better training with larger batches
Elad Hoffer
Tal Ben-Nun
Itay Hubara
Niv Giladi
Torsten Hoefler
Daniel Soudry
ODL
129
76
0
27 Jan 2019
Accelerated Training for CNN Distributed Deep Learning through Automatic Resource-Aware Layer Placement
Jay H. Park
Sunghwan Kim
Jinwon Lee
Myeongjae Jeon
S. Noh
42
11
0
17 Jan 2019
A Distributed Synchronous SGD Algorithm with Global Top-$k$ Sparsification for Low Bandwidth Networks
Shaoshuai Shi
Qiang-qiang Wang
Kaiyong Zhao
Zhenheng Tang
Yuxin Wang
Xiang Huang
Xiaowen Chu
90
137
0
14 Jan 2019
UPSNet: A Unified Panoptic Segmentation Network
Yuwen Xiong
Renjie Liao
Hengshuang Zhao
Rui Hu
Min Bai
Ersin Yumer
R. Urtasun
SSeg
97
432
0
12 Jan 2019
CROSSBOW: Scaling Deep Learning with Small Batch Sizes on Multi-GPU Servers
A. Koliousis
Pijika Watcharapichat
Matthias Weidlich
Kai Zou
Paolo Costa
Peter R. Pietzuch
65
70
0
08 Jan 2019
Stanza: Layer Separation for Distributed Training in Deep Learning
Xiaorui Wu
Hongao Xu
Bo Li
Y. Xiong
MoE
61
9
0
27 Dec 2018
Nonlinear Conjugate Gradients For Scaling Synchronous Distributed DNN Training
Saurabh N. Adya
Vinay Palakkode
Oncel Tuzel
39
4
0
07 Dec 2018
JANUS: Fast and Flexible Deep Learning via Symbolic Graph Execution of Imperative Programs
Eunji Jeong
Sungwoo Cho
Gyeong-In Yu
Joo Seong Jeong
Dongjin Shin
Byung-Gon Chun
59
25
0
04 Dec 2018
MG-WFBP: Efficient Data Communication for Distributed Synchronous SGD Algorithms
Shaoshuai Shi
Xiaowen Chu
Bo Li
FedML
91
90
0
27 Nov 2018
Hydra: A Peer to Peer Distributed Training & Data Collection Framework
Vaibhav Mathur
K. Chahal
OffRL
35
2
0
24 Nov 2018
A Simple Non-i.i.d. Sampling Approach for Efficient Training and Better Generalization
Bowen Cheng
Yunchao Wei
Jiahui Yu
Shiyu Chang
Jinjun Xiong
Wen-mei W. Hwu
Thomas S. Huang
Humphrey Shi
OOD VLM
113
6
0
23 Nov 2018
Workload-aware Automatic Parallelization for Multi-GPU DNN Training
Sungho Shin
Y. Jo
Jungwook Choi
Swagath Venkataramani
Vijayalakshmi Srinivasan
Wonyong Sung
3DH
45
1
0
05 Nov 2018
Democratizing Production-Scale Distributed Deep Learning
Minghuang Ma
Hadi Pouransari
Daniel Chao
Saurabh N. Adya
S. Serrano
Yi Qin
Dan Gimnicher
Dominic Walsh
MoE
110
6
0
31 Oct 2018
A Hitchhiker's Guide On Distributed Training of Deep Neural Networks
K. Chahal
Manraj Singh Grover
Kuntal Dey
3DH OOD
90
54
0
28 Oct 2018
Scalable Distributed DNN Training using TensorFlow and CUDA-Aware MPI: Characterization, Designs, and Performance Evaluation
A. A. Awan
Jeroen Bédorf
Ching-Hsiang Chu
Hari Subramoni
D. Panda
GNN
61
45
0
25 Oct 2018
RLgraph: Modular Computation Graphs for Deep Reinforcement Learning
Michael Schaarschmidt
Sven Mika
Kai Fricke
Eiko Yoneki
OffRL
51
5
0
21 Oct 2018
Fault Tolerance in Iterative-Convergent Machine Learning
Aurick Qiao
Bryon Aragam
Bingjing Zhang
Eric Xing
76
42
0
17 Oct 2018
GPU-Accelerated Robotic Simulation for Distributed Reinforcement Learning
Jacky Liang
Viktor Makoviychuk
Ankur Handa
N. Chentanez
Miles Macklin
Dieter Fox
AI4CE
99
183
0
12 Oct 2018
Exascale Deep Learning for Climate Analytics
Thorsten Kurth
Sean Treichler
Josh Romero
M. Mudigonda
Nathan Luehr
...
Michael A. Matheson
J. Deslippe
M. Fatica
P. Prabhat
Michael Houston
BDL
99
264
0
03 Oct 2018
FanStore: Enabling Efficient and Scalable I/O for Distributed Deep Learning
Zhao-jie Zhang
Lei Huang
U. Manor
Linjing Fang
G. Merlo
C. Michoski
J. Cazes
N. Gaffney
31
22
0
27 Sep 2018
Towards automated neural design: An open source, distributed neural architecture research framework
George Kyriakides
K. Margaritis
16
4
0
20 Sep 2018
Multimodal Trajectory Predictions for Autonomous Driving using Deep Convolutional Networks
Henggang Cui
Vladan Radosavljevic
Fang-Chieh Chou
Tsung-Han Lin
Thi Nguyen
Tzu-Kuo Huang
J. Schneider
Nemanja Djuric
97
618
0
18 Sep 2018
Uncertainty-aware Short-term Motion Prediction of Traffic Actors for Autonomous Driving
Nemanja Djuric
Vladan Radosavljevic
Henggang Cui
Thi Nguyen
Fang-Chieh Chou
Tsung-Han Lin
Nitin Singh
J. Schneider
109
206
0
17 Aug 2018
Highly Scalable Deep Learning Training System with Mixed-Precision: Training ImageNet in Four Minutes
Xianyan Jia
Shutao Song
W. He
Yangzihao Wang
Haidong Rong
...
Li Yu
Tiegang Chen
Guangxiao Hu
Shaoshuai Shi
Xiaowen Chu
112
385
0
30 Jul 2018
An argument in favor of strong scaling for deep neural networks with small datasets
R. L. F. Cunha
Eduardo Rodrigues
Matheus Palhares Viana
Dario Augusto Borges Oliveira
60
2
0
24 Jul 2018
Parallel Restarted SGD with Faster Convergence and Less Communication: Demystifying Why Model Averaging Works for Deep Learning
Hao Yu
Sen Yang
Shenghuo Zhu
MoMe FedML
100
611
0
17 Jul 2018
An Intriguing Failing of Convolutional Neural Networks and the CoordConv Solution
Rosanne Liu
Joel Lehman
Piero Molino
F. Such
Eric Frank
Alexander Sergeev
J. Yosinski
122
895
0
09 Jul 2018
Analysis of DAWNBench, a Time-to-Accuracy Machine Learning Performance Benchmark
Cody Coleman
Daniel Kang
Deepak Narayanan
Luigi Nardi
Tian Zhao
Jian Zhang
Peter Bailis
K. Olukotun
Christopher Ré
Matei A. Zaharia
60
117
0
04 Jun 2018
Mixed-Precision Training for NLP and Speech Recognition with OpenSeq2Seq
Oleksii Kuchaiev
Boris Ginsburg
Igor Gitman
Vitaly Lavrukhin
Jason Chun Lok Li
Huyen Nguyen
Carl Case
Paulius Micikevicius
VLM
72
49
0
25 May 2018