Scale out for large minibatch SGD: Residual network training on ImageNet-1K with improved accuracy and reduced time to train
arXiv:1711.04291 · 12 November 2017
V. Codreanu, Damian Podareanu, V. Saletore

Papers citing "Scale out for large minibatch SGD: Residual network training on ImageNet-1K with improved accuracy and reduced time to train"

Optimus-CC: Efficient Large NLP Model Training with 3D Parallelism Aware Communication Compression
Jaeyong Song, Jinkyu Yim, Jaewon Jung, Hongsun Jang, H. Kim, Youngsok Kim, Jinho Lee
GNN · 24 Jan 2023

Concurrent Adversarial Learning for Large-Batch Training
Yong Liu, Xiangning Chen, Minhao Cheng, Cho-Jui Hsieh, Yang You
ODL · 01 Jun 2021

ImageNet-21K Pretraining for the Masses
T. Ridnik, Emanuel Ben-Baruch, Asaf Noy, Lihi Zelnik-Manor
SSeg, VLM, CLIP · 22 Apr 2021

Deploying Scientific AI Networks at Petaflop Scale on Secure Large Scale HPC Production Systems with Containers
D. Brayford, S. Vallecorsa
20 May 2020

Communication optimization strategies for distributed deep neural network training: A survey
Shuo Ouyang, Dezun Dong, Yemao Xu, Liquan Xiao
06 Mar 2020

Large Batch Optimization for Deep Learning: Training BERT in 76 minutes
Yang You, Jing Li, Sashank J. Reddi, Jonathan Hseu, Sanjiv Kumar, Srinadh Bhojanapalli, Xiaodan Song, J. Demmel, Kurt Keutzer, Cho-Jui Hsieh
ODL · 01 Apr 2019

TF-Replicator: Distributed Machine Learning for Researchers
P. Buchlovsky, David Budden, Dominik Grewe, Chris Jones, John Aslanides, ..., Aidan Clark, Sergio Gomez Colmenarejo, Aedan Pope, Fabio Viola, Dan Belov
GNN, OffRL, AI4CE · 01 Feb 2019

SparCML: High-Performance Sparse Communication for Machine Learning
Cédric Renggli, Saleh Ashkboos, Mehdi Aghagolzadeh, Dan Alistarh, Torsten Hoefler
22 Feb 2018

On Scale-out Deep Learning Training for Cloud and HPC
Srinivas Sridharan, K. Vaidyanathan, Dhiraj D. Kalamkar, Dipankar Das, Mikhail E. Smorkalov, ..., Dheevatsa Mudigere, Naveen Mellempudi, Sasikanth Avancha, Bharat Kaul, Pradeep Dubey
BDL · 24 Jan 2018

On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang
ODL · 15 Sep 2016