Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1807.05358
Cited By
Beyond Data and Model Parallelism for Deep Neural Networks
14 July 2018
Zhihao Jia
Matei A. Zaharia
A. Aiken
GNN
AI4CE
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Beyond Data and Model Parallelism for Deep Neural Networks"
34 / 84 papers shown
Title
MAFAT: Memory-Aware Fusing and Tiling of Neural Networks for Accelerated Edge Inference
J. Farley
A. Gerstlauer
FedML
28
5
0
14 Jul 2021
Chimera: Efficiently Training Large-Scale Neural Networks with Bidirectional Pipelines
Shigang Li
Torsten Hoefler
GNN
AI4CE
LRM
80
131
0
14 Jul 2021
Model-Parallel Model Selection for Deep Learning Systems
Kabir Nagrecha
37
16
0
14 Jul 2021
FLAT: An Optimized Dataflow for Mitigating Attention Bottlenecks
Sheng-Chun Kao
Suvinay Subramanian
Gaurav Agrawal
Amir Yazdanbakhsh
T. Krishna
38
57
0
13 Jul 2021
BAGUA: Scaling up Distributed Learning with System Relaxations
Shaoduo Gan
Xiangru Lian
Rui Wang
Jianbin Chang
Chengjun Liu
...
Jiawei Jiang
Binhang Yuan
Sen Yang
Ji Liu
Ce Zhang
25
30
0
03 Jul 2021
Pre-Trained Models: Past, Present and Future
Xu Han
Zhengyan Zhang
Ning Ding
Yuxian Gu
Xiao Liu
...
Jie Tang
Ji-Rong Wen
Jinhui Yuan
Wayne Xin Zhao
Jun Zhu
AIFin
MQ
AI4MH
58
815
0
14 Jun 2021
GSPMD: General and Scalable Parallelization for ML Computation Graphs
Yuanzhong Xu
HyoukJoong Lee
Dehao Chen
Blake A. Hechtman
Yanping Huang
...
Noam M. Shazeer
Shibo Wang
Tao Wang
Yonghui Wu
Zhifeng Chen
MoE
28
128
0
10 May 2021
CoSA: Scheduling by Constrained Optimization for Spatial Accelerators
Qijing Huang
Minwoo Kang
Grace Dinh
Thomas Norell
Aravind Kalaiah
J. Demmel
J. Wawrzynek
Y. Shao
23
105
0
05 May 2021
PanGu-
α
α
α
: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation
Wei Zeng
Xiaozhe Ren
Teng Su
Hui Wang
Yi-Lun Liao
...
Gaojun Fan
Yaowei Wang
Xuefeng Jin
Qun Liu
Yonghong Tian
ALM
MoE
AI4CE
35
212
0
26 Apr 2021
Partitioning sparse deep neural networks for scalable training and inference
G. Demirci
Hakan Ferhatosmanoglu
20
11
0
23 Apr 2021
An Oracle for Guiding Large-Scale Model/Hybrid Parallel Training of Convolutional Neural Networks
A. Kahira
Truong Thao Nguyen
L. Bautista-Gomez
Ryousei Takano
Rosa M. Badia
M. Wahib
15
9
0
19 Apr 2021
Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM
Deepak Narayanan
M. Shoeybi
Jared Casper
P. LeGresley
M. Patwary
...
Prethvi Kashinkunti
J. Bernauer
Bryan Catanzaro
Amar Phanishayee
Matei A. Zaharia
MoE
37
646
0
09 Apr 2021
On the Utility of Gradient Compression in Distributed Training Systems
Saurabh Agarwal
Hongyi Wang
Shivaram Venkataraman
Dimitris Papailiopoulos
31
46
0
28 Feb 2021
Understanding Capacity-Driven Scale-Out Neural Recommendation Inference
Michael Lui
Yavuz Yetim
Özgür Özkan
Zhuoran Zhao
Shin-Yeh Tsai
Carole-Jean Wu
Mark Hempstead
GNN
BDL
LRM
22
51
0
04 Nov 2020
LazyBatching: An SLA-aware Batching System for Cloud Machine Learning Inference
Yujeong Choi
Yunseong Kim
Minsoo Rhu
24
66
0
25 Oct 2020
Towards a Scalable and Distributed Infrastructure for Deep Learning Applications
Bita Hasheminezhad
S. Shirzad
Nanmiao Wu
Patrick Diehl
Hannes Schulz
Hartmut Kaiser
GNN
AI4CE
27
4
0
06 Oct 2020
Computing Graph Neural Networks: A Survey from Algorithms to Accelerators
S. Abadal
Akshay Jain
Robert Guirado
Jorge López-Alonso
Eduard Alarcón
GNN
36
225
0
30 Sep 2020
VirtualFlow: Decoupling Deep Learning Models from the Underlying Hardware
Andrew Or
Haoyu Zhang
M. Freedman
17
9
0
20 Sep 2020
Scaling Distributed Deep Learning Workloads beyond the Memory Capacity with KARMA
M. Wahib
Haoyu Zhang
Truong Thao Nguyen
Aleksandr Drozd
Jens Domke
Lingqi Zhang
Ryousei Takano
Satoshi Matsuoka
OODD
34
23
0
26 Aug 2020
A Learned Performance Model for Tensor Processing Units
Samuel J. Kaufman
P. Phothilimthana
Yanqi Zhou
Charith Mendis
Sudip Roy
Amit Sabne
Mike Burrows
21
8
0
03 Aug 2020
The Case for Strong Scaling in Deep Learning: Training Large 3D CNNs with Hybrid Parallelism
Yosuke Oyama
N. Maruyama
Nikoli Dryden
Erin McCarthy
P. Harrington
J. Balewski
Satoshi Matsuoka
Peter Nugent
B. Van Essen
3DV
AI4CE
32
37
0
25 Jul 2020
DAPPLE: A Pipelined Data Parallel Approach for Training Large Models
Shiqing Fan
Yi Rong
Chen Meng
Zongyan Cao
Siyu Wang
...
Jun Yang
Lixue Xia
Lansong Diao
Xiaoyong Liu
Wei Lin
21
232
0
02 Jul 2020
Data Movement Is All You Need: A Case Study on Optimizing Transformers
A. Ivanov
Nikoli Dryden
Tal Ben-Nun
Shigang Li
Torsten Hoefler
36
131
0
30 Jun 2020
GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding
Dmitry Lepikhin
HyoukJoong Lee
Yuanzhong Xu
Dehao Chen
Orhan Firat
Yanping Huang
M. Krikun
Noam M. Shazeer
Z. Chen
MoE
43
1,108
0
30 Jun 2020
Memory-Efficient Pipeline-Parallel DNN Training
Deepak Narayanan
Amar Phanishayee
Kaiyu Shi
Xie Chen
Matei A. Zaharia
MoE
36
212
0
16 Jun 2020
HetPipe: Enabling Large DNN Training on (Whimpy) Heterogeneous GPU Clusters through Integration of Pipelined Model Parallelism and Data Parallelism
Jay H. Park
Gyeongchan Yun
Chang Yi
N. T. Nguyen
Seungmin Lee
Jaesik Choi
S. Noh
Young-ri Choi
MoE
25
128
0
28 May 2020
Exascale Deep Learning for Scientific Inverse Problems
N. Laanait
Josh Romero
Junqi Yin
M. T. Young
Sean Treichler
V. Starchenko
A. Borisevich
Alexander Sergeev
Michael A. Matheson
FedML
BDL
35
29
0
24 Sep 2019
Taming Momentum in a Distributed Asynchronous Environment
Ido Hakimi
Saar Barkai
Moshe Gabel
Assaf Schuster
16
23
0
26 Jul 2019
Fully Decoupled Neural Network Learning Using Delayed Gradients
Huiping Zhuang
Yi Wang
Qinglai Liu
Shuai Zhang
Zhiping Lin
FedML
14
29
0
21 Jun 2019
RLgraph: Modular Computation Graphs for Deep Reinforcement Learning
Michael Schaarschmidt
Sven Mika
Kai Fricke
Eiko Yoneki
OffRL
23
5
0
21 Oct 2018
Supporting Very Large Models using Automatic Dataflow Graph Partitioning
Minjie Wang
Chien-chin Huang
Jinyang Li
46
154
0
24 Jul 2018
Analysis of DAWNBench, a Time-to-Accuracy Machine Learning Performance Benchmark
Cody Coleman
Daniel Kang
Deepak Narayanan
Luigi Nardi
Tian Zhao
Jian Zhang
Peter Bailis
K. Olukotun
Christopher Ré
Matei A. Zaharia
13
117
0
04 Jun 2018
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Z. Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
716
6,746
0
26 Sep 2016
Convolutional Neural Networks for Sentence Classification
Yoon Kim
AILaw
VLM
267
13,368
0
25 Aug 2014
Previous
1
2