Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1712.04432
Cited By
Integrated Model, Batch and Domain Parallelism in Training Neural Networks
12 December 2017
A. Gholami
A. Azad
Peter H. Jin
Kurt Keutzer
A. Buluç
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Integrated Model, Batch and Domain Parallelism in Training Neural Networks"
21 / 21 papers shown
Title
Hardware Scaling Trends and Diminishing Returns in Large-Scale Distributed Training
Jared Fernandez
Luca Wehrstedt
Leonid Shamis
Mostafa Elhoushi
Kalyan Saladi
Yonatan Bisk
Emma Strubell
Jacob Kahn
221
3
0
20 Nov 2024
Neural Network Methods for Radiation Detectors and Imaging
S. Lin
S. Ning
H. Zhu
T. Zhou
C. L. Morris
S. Clayton
M. Cherukara
R. T. Chen
Z. Wang
AI4CE
32
5
0
09 Nov 2023
LOFT: Finding Lottery Tickets through Filter-wise Training
Qihan Wang
Chen Dun
Fangshuo Liao
C. Jermaine
Anastasios Kyrillidis
23
3
0
28 Oct 2022
OLLA: Optimizing the Lifetime and Location of Arrays to Reduce the Memory Usage of Neural Networks
Benoit Steiner
Mostafa Elhoushi
Jacob Kahn
James Hegarty
29
8
0
24 Oct 2022
Model-Parallel Model Selection for Deep Learning Systems
Kabir Nagrecha
37
16
0
14 Jul 2021
ResIST: Layer-Wise Decomposition of ResNets for Distributed Training
Chen Dun
Cameron R. Wolfe
C. Jermaine
Anastasios Kyrillidis
16
21
0
02 Jul 2021
An Oracle for Guiding Large-Scale Model/Hybrid Parallel Training of Convolutional Neural Networks
A. Kahira
Truong Thao Nguyen
L. Bautista-Gomez
Ryousei Takano
Rosa M. Badia
M. Wahib
15
9
0
19 Apr 2021
Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM
Deepak Narayanan
M. Shoeybi
Jared Casper
P. LeGresley
M. Patwary
...
Prethvi Kashinkunti
J. Bernauer
Bryan Catanzaro
Amar Phanishayee
Matei A. Zaharia
MoE
37
646
0
09 Apr 2021
GIST: Distributed Training for Large-Scale Graph Convolutional Networks
Cameron R. Wolfe
Jingkang Yang
Arindam Chowdhury
Chen Dun
Artun Bayer
Santiago Segarra
Anastasios Kyrillidis
BDL
GNN
LRM
51
9
0
20 Feb 2021
Integrating Deep Learning in Domain Sciences at Exascale
Rick Archibald
E. Chow
E. DÁzevedo
Jack J. Dongarra
M. Eisenbach
...
Florent Lopez
Daniel Nichols
S. Tomov
Kwai Wong
Junqi Yin
PINN
23
5
0
23 Nov 2020
Scaling Distributed Deep Learning Workloads beyond the Memory Capacity with KARMA
M. Wahib
Haoyu Zhang
Truong Thao Nguyen
Aleksandr Drozd
Jens Domke
Lingqi Zhang
Ryousei Takano
Satoshi Matsuoka
OODD
34
23
0
26 Aug 2020
The Case for Strong Scaling in Deep Learning: Training Large 3D CNNs with Hybrid Parallelism
Yosuke Oyama
N. Maruyama
Nikoli Dryden
Erin McCarthy
P. Harrington
J. Balewski
Satoshi Matsuoka
Peter Nugent
B. Van Essen
3DV
AI4CE
32
37
0
25 Jul 2020
ICA-UNet: ICA Inspired Statistical UNet for Real-time 3D Cardiac Cine MRI Segmentation
Tianchen Wang
Xiaowei Xu
Jinjun Xiong
Qianjun Jia
Haiyun Yuan
Meiping Huang
Jian Zhuang
Yiyu Shi
14
21
0
18 Jul 2020
Data Movement Is All You Need: A Case Study on Optimizing Transformers
A. Ivanov
Nikoli Dryden
Tal Ben-Nun
Shigang Li
Torsten Hoefler
36
131
0
30 Jun 2020
Reducing Communication in Graph Neural Network Training
Alok Tripathy
Katherine Yelick
A. Buluç
GNN
30
104
0
07 May 2020
Optimizing Multi-GPU Parallelization Strategies for Deep Learning Training
Saptadeep Pal
Eiman Ebrahimi
A. Zulfiqar
Yaosheng Fu
Victor Zhang
Szymon Migacz
D. Nellans
Puneet Gupta
34
55
0
30 Jul 2019
Improving Strong-Scaling of CNN Training by Exploiting Finer-Grained Parallelism
Nikoli Dryden
N. Maruyama
Tom Benson
Tim Moon
M. Snir
B. Van Essen
26
49
0
15 Mar 2019
Parameter Re-Initialization through Cyclical Batch Size Schedules
Norman Mu
Z. Yao
A. Gholami
Kurt Keutzer
Michael W. Mahoney
ODL
30
8
0
04 Dec 2018
SqueezeNext: Hardware-Aware Neural Network Design
A. Gholami
K. Kwon
Bichen Wu
Zizheng Tai
Xiangyu Yue
Peter H. Jin
Sicheng Zhao
Kurt Keutzer
22
295
0
23 Mar 2018
Demystifying Parallel and Distributed Deep Learning: An In-Depth Concurrency Analysis
Tal Ben-Nun
Torsten Hoefler
GNN
33
702
0
26 Feb 2018
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
ODL
308
2,890
0
15 Sep 2016
1