ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

Horovod: fast and easy distributed deep learning in TensorFlow
Alexander Sergeev, Mike Del Balso — 15 February 2018
ArXiv (abs) · PDF · HTML · GitHub (14,494★)

Papers citing "Horovod: fast and easy distributed deep learning in TensorFlow" (50 of 454 shown)

• Training DNN Models over Heterogeneous Clusters with Optimal Performance — Chengyi Nie, Jessica Maghakian, Zhenhua Liu (07 Feb 2024)
• Breaking MLPerf Training: A Case Study on Optimizing BERT — Yongdeok Kim, Jaehyung Ahn, Myeongwoo Kim, Changin Choi, Heejae Kim, ..., Xiongzhan Linghu, Jingkun Ma, Lin Chen, Yuehua Dai, Sungjoo Yoo (04 Feb 2024)
• InternEvo: Efficient Long-sequence Large Language Model Training via Hybrid Parallelism and Redundant Sharding — Qiaoling Chen, Diandian Gu, Guoteng Wang, Xun Chen, Yingtong Xiong, ..., Qi Hu, Xin Jin, Yonggang Wen, Tianwei Zhang, Peng Sun (17 Jan 2024)
• Activations and Gradients Compression for Model-Parallel Training — Mikhail Rudakov, Aleksandr Beznosikov, Yaroslav Kholodov, Alexander Gasnikov (15 Jan 2024)
• G-Meta: Distributed Meta Learning in GPU Clusters for Large-Scale Recommender Systems — Youshao Xiao, Shangchun Zhao, Zhenglei Zhou, Zhaoxin Huan, Lin Ju, Xiaolu Zhang, Lin Wang, Jun Zhou [OffRL] (09 Jan 2024)
• Ravnest: Decentralized Asynchronous Training on Heterogeneous Devices — A. Menon, Unnikrishnan Menon, Kailash Ahirwar (03 Jan 2024)
• On the Burstiness of Distributed Machine Learning Traffic — Natchanon Luangsomboon, Fahimeh Fazel, Jörg Liebeherr, A. Sobhani, Shichao Guan, Xingjun Chu (30 Dec 2023)
• An Adaptive Placement and Parallelism Framework for Accelerating RLHF Training — Youshao Xiao, Weichang Wu, Zhenglei Zhou, Fagui Mao, Shangchun Zhao, Lin Ju, Lei Liang, Xiaolu Zhang, Jun Zhou (19 Dec 2023)
• Tenplex: Dynamic Parallelism for Deep Learning using Parallelizable Tensor Collections — Marcel Wagenlander, Guo Li, Bo Zhao, Kai Zou, Peter R. Pietzuch (08 Dec 2023)
• Moirai: Towards Optimal Placement for Distributed Inference on Heterogeneous Devices — Beibei Zhang, Hongwei Zhu, Feng Gao, Zhihui Yang, Xiaoyang Sean Wang (07 Dec 2023)
• Holmes: Towards Distributed Training Across Clusters with Heterogeneous NIC Environment — Fei Yang, Shuang Peng, Ning Sun, Fangyu Wang, Ke Tan, Fu Wu, Jiezhong Qiu, Aimin Pan (06 Dec 2023)
• The Landscape of Modern Machine Learning: A Review of Machine, Distributed and Federated Learning — Omer Subasi, Oceane Bel, Joseph Manzano, Kevin J. Barker [FedML, OOD, PINN] (05 Dec 2023)
• PAUNet: Precipitation Attention-based U-Net for rain prediction from satellite radiance data — P. J. Reddy, Harish Baki, Sandeep Chinta, Richard Matear, John Taylor (30 Nov 2023)
• Near-Linear Scaling Data Parallel Training with Overlapping-Aware Gradient Compression — Lin Meng, Yuzhong Sun, Weimin Li (08 Nov 2023)
• RTP: Rethinking Tensor Parallelism with Memory Deduplication — Cheng Luo, Tianle Zhong, Geoffrey C. Fox (02 Nov 2023)
• TRANSOM: An Efficient Fault-Tolerant System for Training LLMs — Baodong Wu, Lei Xia, Qingping Li, Kangyu Li, Xu Chen, Yongqiang Guo, Tieyao Xiang, Yuheng Chen, Shigang Li (16 Oct 2023)
• Distributed Transfer Learning with 4th Gen Intel Xeon Processors — Lakshmi Arunachalam, Fahim Mohammad, Vrushabh H. Sanghavi (10 Oct 2023)
• Rethinking Memory and Communication Cost for Efficient Large Language Model Training — Chan Wu, Hanxiao Zhang, Lin Ju, Jinjing Huang, Youshao Xiao, ..., Siyuan Li, Fanzhuang Meng, Lei Liang, Xiaolu Zhang, Jun Zhou (09 Oct 2023)
• Ring Attention with Blockwise Transformers for Near-Infinite Context — Hao Liu, Matei A. Zaharia, Pieter Abbeel (03 Oct 2023)
• AI ensemble for signal detection of higher order gravitational wave modes of quasi-circular, spinning, non-precessing binary black hole mergers — Minyang Tian, Eliu A. Huerta, Huihuo Zheng (29 Sep 2023)
• Early Churn Prediction from Large Scale User-Product Interaction Time Series — S. Bhattacharjee, Utkarsh Thukral, Nilesh Patil [AI4TS] (25 Sep 2023)
• Toward efficient resource utilization at edge nodes in federated learning — Sadi Alawadi, Addi Ait-Mlouk, Salman Toor, Andreas Hellander [FedML] (19 Sep 2023)
• Improved particle-flow event reconstruction with scalable neural networks for current and future particle detectors — J. Pata, Eric Wulff, Farouk Mokhtar, D. Southwick, Mengke Zhang, M. Girone, Javier Duarte (13 Sep 2023)
• Saturn: An Optimized Data System for Large Model Deep Learning Workloads — Kabir Nagrecha, Arun Kumar (03 Sep 2023)
• FusionAI: Decentralized Training and Deploying LLMs with Massive Consumer-Level GPUs — Zhenheng Tang, Yuxin Wang, Xin He, Longteng Zhang, Xinglin Pan, ..., Rongfei Zeng, Kaiyong Zhao, Shaoshuai Shi, Bingsheng He, Xiaowen Chu (03 Sep 2023)
• Physics informed Neural Networks applied to the description of wave-particle resonance in kinetic simulations of fusion plasmas — J. Kumar, D. Zarzoso, V. Grandgirard, Jana Ebert, Stefan Kesselheim [PINN] (23 Aug 2023)
• Monitoring of Urban Changes with multi-modal Sentinel 1 and 2 Data in Mariupol, Ukraine, in 2022/23 — Georg Zitzlsberger, M. Podhorányi (11 Aug 2023)
• Isolated Scheduling for Distributed Training Tasks in GPU Clusters — Xinchi Han, Weihao Jiang, Peirui Cao, Qinwei Yang, Yunzhuo Liu, Shuyao Qi, Sheng-Yuan Lin, Shi-Ming Zhao (10 Aug 2023)
• Eva: A General Vectorized Approximation Framework for Second-order Optimization — Lin Zhang, Shaoshuai Shi, Yue Liu (04 Aug 2023)
• CASSINI: Network-Aware Job Scheduling in Machine Learning Clusters — S. Rajasekaran, M. Ghobadi, Aditya Akella [GNN] (01 Aug 2023)
• Multi-GPU Approach for Training of Graph ML Models on large CFD Meshes — Sebastian Strönisch, Maximilian Sander, A. Knüpfer, M. Meyer [AI4CE] (25 Jul 2023)
• DyPP: Dynamic Parameter Prediction to Accelerate Convergence of Variational Quantum Algorithms — Satwik Kundu, Debarshi Kundu, Swaroop Ghosh (23 Jul 2023)
• Robust Fully-Asynchronous Methods for Distributed Training over General Architecture — Zehan Zhu, Ye Tian, Yan Huang, Jinming Xu, Shibo He [OOD] (21 Jul 2023)
• A Survey From Distributed Machine Learning to Distributed Deep Learning — Mohammad Dehghani, Zahra Yazdanparast (11 Jul 2023)
• Pollen: High-throughput Federated Learning Simulation via Resource-Aware Client Placement — Lorenzo Sani, Pedro Gusmão, Alexandru Iacob, Wanru Zhao, Xinchi Qiu, Yan Gao, Javier Fernandez-Marques, Nicholas D. Lane (30 Jun 2023)
• Training Deep Surrogate Models with Large Scale Online Learning — Lucas Meyer, M. Schouler, R. Caulk, Alejandro Ribés, Bruno Raffin [3DGS, AI4CE] (28 Jun 2023)
• Evaluation and Optimization of Gradient Compression for Distributed Deep Learning — Lin Zhang, Longteng Zhang, Shaoshuai Shi, Xiaowen Chu, Yue Liu [OffRL] (15 Jun 2023)
• DistSim: A performance model of large-scale hybrid distributed DNN training — Guandong Lu, Run Chen, Yakai Wang, Yangjie Zhou, Rui Zhang, ..., Yanming Miao, Zhifang Cai, Li-Wei Li, Jingwen Leng, Minyi Guo (14 Jun 2023)
• A²CiD²: Accelerating Asynchronous Communication in Decentralized Deep Learning — Adel Nabli, Eugene Belilovsky, Edouard Oyallon (14 Jun 2023)
• How Can We Train Deep Learning Models Across Clouds and Continents? An Experimental Study — Alexander Isenko, R. Mayer, Hans-Arno Jacobsen (05 Jun 2023)
• Data-Efficient French Language Modeling with CamemBERTa — Wissam Antoun, Benoît Sagot, Djamé Seddah (02 Jun 2023)
• Automated Tensor Model Parallelism with Overlapped Communication for Efficient Foundation Model Training — Shengwei Li, Zhiquan Lai, Yanqi Hao, Weijie Liu, Ke-shi Ge, Xiaoge Deng, Dongsheng Li, KaiCheng Lu (25 May 2023)
• ADA-GP: Accelerating DNN Training By Adaptive Gradient Prediction — Vahid Janfaza, Shantanu Mandal, Farabi Mahmud, A. Muzahid (22 May 2023)
• ASTRA-sim2.0: Modeling Hierarchical Networks and Disaggregated Systems for Large-model Training at Scale — William Won, Taekyung Heo, Saeed Rashidi, Srinivas Sridharan, Sudarshan Srinivasan, T. Krishna (24 Mar 2023)
• A Survey on Class Imbalance in Federated Learning — Jing Zhang, Chuanwen Li, Jianzhong Qi, Jiayuan He [FedML] (21 Mar 2023)
• MCR-DL: Mix-and-Match Communication Runtime for Deep Learning — Quentin G. Anthony, A. A. Awan, Jeff Rasley, Yuxiong He, Mustafa Abduljabbar, Hari Subramoni, D. Panda [MoE] (15 Mar 2023)
• OCCL: a Deadlock-free Library for GPU Collective Communication — Lichen Pan, Juncheng Liu, Jinhui Yuan, Rongkai Zhang, Pengze Li, Zhen Xiao (11 Mar 2023)
• Cloudless-Training: A Framework to Improve Efficiency of Geo-Distributed ML Training — W. Tan, Xiao Shi, Cunchi Lv, Xiaofang Zhao [FedML] (09 Mar 2023)
• A Comprehensive Survey of AI-Generated Content (AIGC): A History of Generative AI from GAN to ChatGPT — Yihan Cao, Siyu Li, Yixin Liu, Zhiling Yan, Yutong Dai, Philip S. Yu, Lichao Sun (07 Mar 2023)
• Ada-Grouper: Accelerating Pipeline Parallelism in Preempted Network by Adaptive Group-Scheduling for Micro-Batches — Siyu Wang, Zongyan Cao, Chang Si, Lansong Diao, Jiamang Wang, W. Lin (03 Mar 2023)