OCCL: a Deadlock-free Library for GPU Collective Communication

11 March 2023
Lichen Pan, Juncheng Liu, Jinhui Yuan, Rongkai Zhang, Pengze Li, Zhen Xiao

Papers citing "OCCL: a Deadlock-free Library for GPU Collective Communication"

9 papers shown

Pathways: Asynchronous Distributed Dataflow for ML
P. Barham, Aakanksha Chowdhery, J. Dean, Sanjay Ghemawat, Steven Hand, ..., Parker Schuh, Ryan Sepassi, Laurent El Shafey, C. A. Thekkath, Yonghui Wu
GNN, MoE
23 Mar 2022

TACCL: Guiding Collective Algorithm Synthesis using Communication Sketches
Aashaka Shah, Vijay Chidambaram, M. Cowan, Saeed Maleki, Madan Musuvathi, Todd Mytkowicz, Jacob Nelson, Olli Saarikivi, Rachee Singh
08 Nov 2021

Maximizing Parallelism in Distributed Training for Huge Neural Networks
Zhengda Bian, Qifan Xu, Boxiang Wang, Yang You
MoE
30 May 2021

Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM
Deepak Narayanan, Mohammad Shoeybi, Jared Casper, P. LeGresley, M. Patwary, ..., Prethvi Kashinkunti, J. Bernauer, Bryan Catanzaro, Amar Phanishayee, Matei A. Zaharia
MoE
09 Apr 2021

Blink: Fast and Generic Collectives for Distributed ML
Guanhua Wang, Shivaram Venkataraman, Amar Phanishayee, J. Thelin, Nikhil R. Devanur, Ion Stoica
VLM
11 Oct 2019

Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
Mohammad Shoeybi, M. Patwary, Raul Puri, P. LeGresley, Jared Casper, Bryan Catanzaro
MoE
17 Sep 2019

Priority-based Parameter Propagation for Distributed DNN Training
Anand Jayarajan, Jinliang Wei, Garth A. Gibson, Alexandra Fedorova, Gennady Pekhimenko
AI4CE
10 May 2019

TicTac: Accelerating Distributed Deep Learning with Communication Scheduling
Sayed Hadi Hashemi, Sangeetha Abdu Jyothi, R. Campbell
08 Mar 2018

Poseidon: An Efficient Communication Architecture for Distributed Deep Learning on GPU Clusters
Huatian Zhang, Zeyu Zheng, Shizhen Xu, Wei-Ming Dai, Qirong Ho, Xiaodan Liang, Zhiting Hu, Jinliang Wei, P. Xie, Eric Xing
GNN
11 Jun 2017