Better Schedules for Low Precision Training of Deep Neural Networks

4 March 2024

Papers citing "Better Schedules for Low Precision Training of Deep Neural Networks"

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model BigScience Workshop: Teven Le Scao Angela Fan Christopher Akiki ... Zhongli Xie Zifan Ye M. Bras Younes Belkada Thomas Wolf VLM 344 2,377 0 09 Nov 2022
PipeGCN: Efficient Full-Graph Training of Graph Convolutional Networks with Pipelined Feature Communication Cheng Wan Youjie Li Cameron R. Wolfe Anastasios Kyrillidis Namjae Kim Yingyan Lin GNN 58 70 0 20 Mar 2022
PROFIT: A Novel Training Method for sub-4-bit MobileNet Models Eunhyeok Park S. Yoo MQ 40 85 0 11 Aug 2020
Open Graph Benchmark: Datasets for Machine Learning on Graphs Weihua Hu Matthias Fey Marinka Zitnik Yuxiao Dong Hongyu Ren Bowen Liu Michele Catasta J. Leskovec 271 2,719 0 02 May 2020
Demon: Improved Neural Network Training with Momentum Decay John Chen Cameron R. Wolfe Zhaoqi Li Anastasios Kyrillidis ODL 49 15 0 11 Oct 2019
Training High-Performance and Large-Scale Deep Neural Networks with Full 8-bit Integers Yukuan Yang Shuang Wu Lei Deng Tianyi Yan Yuan Xie Guoqi Li MQ 129 112 0 05 Sep 2019
Time Matters in Regularizing Deep Networks: Weight Decay and Data Augmentation Affect Early Learning Dynamics, Matter Little Near Convergence Aditya Golatkar Alessandro Achille Stefano Soatto 71 96 0 30 May 2019
Learned Step Size Quantization S. K. Esser J. McKinstry Deepika Bablani R. Appuswamy D. Modha MQ 69 798 0 21 Feb 2019
HAQ: Hardware-Aware Automated Quantization with Mixed Precision Kuan Wang Zhijian Liu Yujun Lin Ji Lin Song Han MQ 115 879 0 21 Nov 2018
Scalable Methods for 8-bit Training of Neural Networks Ron Banner Itay Hubara Elad Hoffer Daniel Soudry MQ 84 337 0 25 May 2018
A disciplined approach to neural network hyper-parameters: Part 1 -- learning rate, batch size, momentum, and weight decay L. Smith 271 1,028 0 26 Mar 2018
MobileNetV2: Inverted Residuals and Linear Bottlenecks Mark Sandler Andrew G. Howard Menglong Zhu A. Zhmoginov Liang-Chieh Chen 169 19,204 0 13 Jan 2018
Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference Benoit Jacob S. Kligys Bo Chen Menglong Zhu Matthew Tang Andrew G. Howard Hartwig Adam Dmitry Kalenichenko MQ 136 3,111 0 15 Dec 2017
SkipNet: Learning Dynamic Routing in Convolutional Networks Xin Wang Feng Yu Zi-Yi Dou Trevor Darrell Joseph E. Gonzalez 66 633 0 26 Nov 2017
Inductive Representation Learning on Large Graphs William L. Hamilton Z. Ying J. Leskovec 446 15,179 0 07 Jun 2017
Semi-Supervised Classification with Graph Convolutional Networks Thomas Kipf Max Welling GNN SSL 571 28,964 0 09 Sep 2016
Pruning Filters for Efficient ConvNets Hao Li Asim Kadav Igor Durdanovic H. Samet H. Graf 3DPC 186 3,692 0 31 Aug 2016
SGDR: Stochastic Gradient Descent with Warm Restarts I. Loshchilov Frank Hutter ODL 288 8,091 0 13 Aug 2016
Ternary Weight Networks Fengfu Li Bin Liu Xiaoxing Wang Bo Zhang Junchi Yan MQ 59 525 0 16 May 2016
Cyclical Learning Rates for Training Neural Networks L. Smith ODL 163 2,517 0 03 Jun 2015
Recurrent Neural Network Regularization Wojciech Zaremba Ilya Sutskever Oriol Vinyals ODL 123 2,774 0 08 Sep 2014