1802.05799
Horovod: fast and easy distributed deep learning in TensorFlow
15 February 2018
Alexander Sergeev
Mike Del Balso
Github (14494★)
Papers citing
"Horovod: fast and easy distributed deep learning in TensorFlow"
50 / 454 papers shown
Title
TACCL: Guiding Collective Algorithm Synthesis using Communication Sketches
Aashaka Shah
Vijay Chidambaram
M. Cowan
Saeed Maleki
Madan Musuvathi
Todd Mytkowicz
Jacob Nelson
Olli Saarikivi
Rachee Singh
51
60
0
08 Nov 2021
BlueFog: Make Decentralized Algorithms Practical for Optimization and Deep Learning
Bicheng Ying
Kun Yuan
Hanbin Hu
Yiming Chen
W. Yin
FedML
83
28
0
08 Nov 2021
A System for General In-Hand Object Re-Orientation
Tao Chen
Jie Xu
Pulkit Agrawal
139
258
0
04 Nov 2021
OneFlow: Redesign the Distributed Deep Learning Framework from Scratch
Jinhui Yuan
Xinqi Li
Cheng Cheng
Juncheng Liu
Ran Guo
...
Fei Yang
Xiaodong Yi
Chuan Wu
Haoran Zhang
Jie Zhao
62
41
0
28 Oct 2021
Colossal-AI: A Unified Deep Learning System For Large-Scale Parallel Training
Yongbin Li
Hongxin Liu
Zhengda Bian
Boxiang Wang
Haichen Huang
Fan Cui
Chuan-Qing Wang
Yang You
GNN
111
149
0
28 Oct 2021
AxoNN: An asynchronous, message-driven parallel framework for extreme-scale deep learning
Siddharth Singh
A. Bhatele
GNN
97
15
0
25 Oct 2021
Synthesizing Optimal Parallelism Placement and Reduction Strategies on Hierarchical Systems for Deep Learning
Ningning Xie
Tamara Norman
Dominik Grewe
Dimitrios Vytiniotis
75
17
0
20 Oct 2021
EmbRace: Accelerating Sparse Communication for Distributed Training of NLP Neural Networks
Shengwei Li
Zhiquan Lai
Dongsheng Li
Yiming Zhang
Xiangyu Ye
Yabo Duan
FedML
63
3
0
18 Oct 2021
Adaptive Elastic Training for Sparse Deep Learning on Heterogeneous Multi-GPU Servers
Yujing Ma
Florin Rusu
Kesheng Wu
A. Sim
102
3
0
13 Oct 2021
Relative Molecule Self-Attention Transformer
Lukasz Maziarka
Dawid Majchrowski
Tomasz Danel
Piotr Gaiński
Jacek Tabor
Igor T. Podolak
Pawel M. Morkisz
Stanislaw Jastrzebski
MedIm
92
36
0
12 Oct 2021
SWAT Watershed Model Calibration using Deep Learning
M. Mudunuru
K. Son
Pin Jiang
X. Chen
45
2
0
06 Oct 2021
Solon: Communication-efficient Byzantine-resilient Distributed Training via Redundant Gradients
Lingjiao Chen
Leshang Chen
Hongyi Wang
S. Davidson
Yan Sun
FedML
66
1
0
04 Oct 2021
TSM: Temporal Shift Module for Efficient and Scalable Video Understanding on Edge Device
Ji Lin
Chuang Gan
Kuan-Chieh Wang
Song Han
100
65
0
27 Sep 2021
Neural Architecture Search in operational context: a remote sensing case-study
Anthony Cazasnoves
Pierre-Antoine Ganaye
Kévin Sanchis
Tugdual Ceillier
68
0
0
15 Sep 2021
Multilingual Translation via Grafting Pre-trained Language Models
Zewei Sun
Mingxuan Wang
Lei Li
AI4CE
240
22
0
11 Sep 2021
Tolerating Adversarial Attacks and Byzantine Faults in Distributed Machine Learning
Yusen Wu
Hao Chen
Xin Wang
Chao Liu
Phuong Nguyen
Yelena Yesha
AAML
42
5
0
05 Sep 2021
Compressing gradients by exploiting temporal correlation in momentum-SGD
Tharindu B. Adikari
S. Draper
20
0
0
17 Aug 2021
HPTMT Parallel Operators for High Performance Data Science & Data Engineering
V. Abeykoon
Supun Kamburugamuve
Chathura Widanage
Niranda Perera
A. Uyar
Thejaka Amila Kanewala
G. V. Laszewski
Geoffrey C. Fox
AI4TS
31
1
0
13 Aug 2021
HPTMT: Operator-Based Architecture for Scalable High-Performance Data-Intensive Frameworks
Supun Kamburugamuve
Chathura Widanage
Niranda Perera
V. Abeykoon
A. Uyar
Thejaka Amila Kanewala
G. V. Laszewski
Geoffrey C. Fox
34
4
0
27 Jul 2021
MXDAG: A Hybrid Abstraction for Cluster Applications
Weitao Wang
Sushovan Das
X. Wu
Zhuang Wang
Ang Chen
T. Ng
98
1
0
15 Jul 2021
You Do Not Need a Bigger Boat: Recommendations at Reasonable Scale in a (Mostly) Serverless and Open Stack
Jacopo Tagliabue
71
15
0
15 Jul 2021
Chimera: Efficiently Training Large-Scale Neural Networks with Bidirectional Pipelines
Shigang Li
Torsten Hoefler
GNN
AI4CE
LRM
130
138
0
14 Jul 2021
An Efficient DP-SGD Mechanism for Large Scale NLP Models
Christophe Dupuy
Radhika Arava
Rahul Gupta
Anna Rumshisky
SyDa
97
36
0
14 Jul 2021
KAISA: An Adaptive Second-Order Optimizer Framework for Deep Neural Networks
J. G. Pauloski
Qi Huang
Lei Huang
Shivaram Venkataraman
Kyle Chard
Ian Foster
Zhao-jie Zhang
86
29
0
04 Jul 2021
BAGUA: Scaling up Distributed Learning with System Relaxations
Shaoduo Gan
Xiangru Lian
Rui Wang
Jianbin Chang
Chengjun Liu
...
Jiawei Jiang
Binhang Yuan
Sen Yang
Ji Liu
Ce Zhang
88
30
0
03 Jul 2021
ResIST: Layer-Wise Decomposition of ResNets for Distributed Training
Chen Dun
Cameron R. Wolfe
C. Jermaine
Anastasios Kyrillidis
87
21
0
02 Jul 2021
JUWELS Booster -- A Supercomputer for Large-Scale AI Research
Stefan Kesselheim
A. Herten
K. Krajsek
J. Ebert
J. Jitsev
...
A. Strube
Roshni Kamath
Martin G. Schultz
M. Riedel
T. Lippert
GNN
78
16
0
30 Jun 2021
Flare: Flexible In-Network Allreduce
Daniele De Sensi
Salvatore Di Girolamo
Saleh Ashkboos
Shigang Li
Torsten Hoefler
76
42
0
29 Jun 2021
BFTrainer: Low-Cost Training of Neural Networks on Unfillable Supercomputer Nodes
Zhengchun Liu
R. Kettimuthu
M. Papka
Ian Foster
46
3
0
22 Jun 2021
Secure Distributed Training at Scale
Eduard A. Gorbunov
Alexander Borzunov
Michael Diskin
Max Ryabinin
FedML
90
15
0
21 Jun 2021
Distributed Deep Learning in Open Collaborations
Michael Diskin
Alexey Bukhtiyarov
Max Ryabinin
Lucile Saulnier
Quentin Lhoest
...
Denis Mazur
Ilia Kobelev
Yacine Jernite
Thomas Wolf
Gennady Pekhimenko
FedML
129
59
0
18 Jun 2021
Dynamic Gradient Aggregation for Federated Domain Adaptation
Dimitrios Dimitriadis
K. Kumatani
R. Gmyr
Yashesh Gaur
Sefik Emre Eskimez
FedML
72
5
0
14 Jun 2021
Efficient and Less Centralized Federated Learning
Li Chou
Zichang Liu
Zhuang Wang
Anshumali Shrivastava
FedML
64
19
0
11 Jun 2021
Communication-efficient SGD: From Local SGD to One-Shot Averaging
Artin Spiridonoff
Alexander Olshevsky
I. Paschalidis
FedML
110
20
0
09 Jun 2021
MALib: A Parallel Framework for Population-based Multi-agent Reinforcement Learning
Ming Zhou
Bo Liu
Hanjing Wang
Muning Wen
Runzhe Wu
Ying Wen
Yaodong Yang
Weinan Zhang
Jun Wang
OffRL
61
49
0
05 Jun 2021
Effect of Pre-Training Scale on Intra- and Inter-Domain Full and Few-Shot Transfer Learning for Natural and Medical X-Ray Chest Images
Mehdi Cherti
J. Jitsev
LM&MA
100
24
0
31 May 2021
Bridging Data Center AI Systems with Edge Computing for Actionable Information Retrieval
Zhengchun Liu
Ahsan Ali
Peter Kenesei
Antonino Miceli
Hemant Sharma
...
Naoufal Layad
Jana Thayer
R. Herbst
Chun Hong Yoon
Ian Foster
58
23
0
28 May 2021
Towards Quantized Model Parallelism for Graph-Augmented MLPs Based on Gradient-Free ADMM Framework
Junxiang Wang
Hongyi Li
Zheng Chai
Yongchao Wang
Yue Cheng
Liang Zhao
MQ
48
3
0
20 May 2021
Compressed Communication for Distributed Training: Adaptive Methods and System
Yuchen Zhong
Cong Xie
Shuai Zheng
Yanghua Peng
74
9
0
17 May 2021
Extracting Variable-Depth Logical Document Hierarchy from Long Documents: Method, Evaluation, and Application
Rongyu Cao
Yixuan Cao
Ganbin Zhou
Ping Luo
26
4
0
14 May 2021
Breaking the Computation and Communication Abstraction Barrier in Distributed Machine Learning Workloads
Abhinav Jangda
Jun Huang
Guodong Liu
Amir Hossein Nodehi Sabet
Saeed Maleki
Youshan Miao
Madan Musuvathi
Todd Mytkowicz
Olli Saarikivi
77
64
0
12 May 2021
Deep Learning Hamiltonian Monte Carlo
Sam Foreman
Xiao-Yong Jin
James C. Osborn
50
16
0
07 May 2021
Distributed Multigrid Neural Solvers on Megavoxel Domains
Aditya Balu
Sergio Botelho
Biswajit Khara
Vinay Rao
Chinmay Hegde
Soumik Sarkar
Santi S. Adavani
A. Krishnamurthy
Baskar Ganapathysubramanian
AI4CE
110
11
0
29 Apr 2021
NUQSGD: Provably Communication-efficient Data-parallel SGD via Nonuniform Quantization
Ali Ramezani-Kebrya
Fartash Faghri
Ilya Markov
V. Aksenov
Dan Alistarh
Daniel M. Roy
MQ
111
33
0
28 Apr 2021
End-to-End Jet Classification of Boosted Top Quarks with the CMS Open Data
Michael Andrews
Bjorn Burkle
Yi-fan Chen
Davide DiCroce
S. Gleyzer
...
N. Pervan
Yusef Shafi
Wei-Ju Sun
Emanuele Usai
Kun Yang
59
10
0
19 Apr 2021
ScaleFreeCTR: MixCache-based Distributed Training System for CTR Models with Huge Embedding Table
Huifeng Guo
Wei Guo
Yong Gao
Ruiming Tang
Xiuqiang He
Wenzhi Liu
87
21
0
17 Apr 2021
Software-Hardware Co-design for Fast and Scalable Training of Deep Learning Recommendation Models
Dheevatsa Mudigere
Y. Hao
Jianyu Huang
Zhihao Jia
Andrew Tulloch
...
Ajit Mathews
Lin Qiao
M. Smelyanskiy
Bill Jia
Vijay Rao
113
155
0
12 Apr 2021
A Hybrid Parallelization Approach for Distributed and Scalable Deep Learning
S. Akintoye
Liangxiu Han
Xin Zhang
Haoming Chen
Daoqiang Zhang
101
15
0
11 Apr 2021
High-Throughput Virtual Screening of Small Molecule Inhibitors for SARS-CoV-2 Protein Targets with Deep Fusion Models
Garrett A. Stevenson
Derek Jones
Hyojin Kim
W. D. Bennett
B. Bennion
...
A. Zemla
Xiaohua Zhang
Fangqiang Zhu
F. Lightstone
Jonathan E. Allen
41
17
0
09 Apr 2021
Librispeech Transducer Model with Internal Language Model Prior Correction
Albert Zeyer
André Merboldt
Wilfried Michel
Ralf Schluter
Hermann Ney
68
30
0
07 Apr 2021