1810.07354
Cited By
Fault Tolerance in Iterative-Convergent Machine Learning
17 October 2018
Aurick Qiao
Bryon Aragam
Bingjing Zhang
Eric Xing
Papers citing "Fault Tolerance in Iterative-Convergent Machine Learning"
2 / 2 papers shown
Poseidon: An Efficient Communication Architecture for Distributed Deep Learning on GPU Clusters
Huatian Zhang
Zeyu Zheng
Shizhen Xu
Wei-Ming Dai
Qirong Ho
Xiaodan Liang
Zhiting Hu
Jinliang Wei
P. Xie
Eric Xing
GNN
47
343
0
11 Jun 2017
Speeding Up Distributed Machine Learning Using Codes
Kangwook Lee
Maximilian Lam
Ramtin Pedarsani
Dimitris Papailiopoulos
Kannan Ramchandran
126
856
0
08 Dec 2015