Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2406.19580
Cited By
FRED: Flexible REduction-Distribution Interconnect and Communication Implementation for Wafer-Scale Distributed Training of DNN Models
28 June 2024
Saeed Rashidi
William Won
Sudarshan Srinivasan
Puneet Gupta
Tushar Krishna
Re-assign community
ArXiv
PDF
HTML
Papers citing
"FRED: Flexible REduction-Distribution Interconnect and Communication Implementation for Wafer-Scale Distributed Training of DNN Models"
10 / 10 papers shown
Title
Themis: A Network Bandwidth-Aware Collective Scheduling Policy for Distributed Training of DL Models
Saeed Rashidi
William Won
Sudarshan Srinivasan
Srinivas Sridharan
T. Krishna
GNN
35
31
0
09 Oct 2021
Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM
Deepak Narayanan
Mohammad Shoeybi
Jared Casper
P. LeGresley
M. Patwary
...
Prethvi Kashinkunti
J. Bernauer
Bryan Catanzaro
Amar Phanishayee
Matei A. Zaharia
MoE
49
667
0
09 Apr 2021
Highly Available Data Parallel ML training on Mesh Networks
Sameer Kumar
N. Jouppi
MoE
AI4CE
14
9
0
06 Nov 2020
GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding
Dmitry Lepikhin
HyoukJoong Lee
Yuanzhong Xu
Dehao Chen
Orhan Firat
Yanping Huang
M. Krikun
Noam M. Shazeer
Zhiwen Chen
MoE
62
1,142
0
30 Jun 2020
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
330
41,106
0
28 May 2020
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
Mohammad Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
268
1,861
0
17 Sep 2019
Deep Learning Recommendation Model for Personalization and Recommendation Systems
Maxim Naumov
Dheevatsa Mudigere
Hao-Jun Michael Shi
Jianyu Huang
Narayanan Sundaraman
...
Wenlin Chen
Vijay Rao
Bill Jia
Liang Xiong
M. Smelyanskiy
37
726
0
31 May 2019
Evaluating Modern GPU Interconnect: PCIe, NVLink, NV-SLI, NVSwitch and GPUDirect
Ang Li
Shuaiwen Leon Song
Jieyang Chen
Jiajia Li
Xu Liu
Nathan R. Tallent
Kevin J. Barker
GNN
35
213
0
11 Mar 2019
Beyond Data and Model Parallelism for Deep Neural Networks
Zhihao Jia
Matei A. Zaharia
A. Aiken
GNN
AI4CE
47
501
0
14 Jul 2018
Deep Residual Learning for Image Recognition
Kaiming He
Xinming Zhang
Shaoqing Ren
Jian Sun
MedIm
866
192,638
0
10 Dec 2015
1