ResearchTrend.AI

Compute-Efficient Deep Learning: Algorithmic Trends and Opportunities
arXiv:2210.06640 · 13 October 2022
Brian Bartoldson, B. Kailkhura, Davis W. Blalock

Papers citing "Compute-Efficient Deep Learning: Algorithmic Trends and Opportunities"

41 of 91 citing papers shown.

  1. ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning
     Z. Yao, A. Gholami, Sheng Shen, Mustafa Mustafa, Kurt Keutzer, Michael W. Mahoney [ODL] · 01 Jun 2020
  2. The Cost of Training NLP Models: A Concise Overview
     Or Sharir, Barak Peleg, Y. Shoham · 19 Apr 2020
  3. Gradient Centralization: A New Optimization Technique for Deep Neural Networks
     Hongwei Yong, Jianqiang Huang, Xiansheng Hua, Lei Zhang [ODL] · 03 Apr 2020
  4. What is the State of Neural Network Pruning?
     Davis W. Blalock, Jose Javier Gonzalez Ortiz, Jonathan Frankle, John Guttag · 06 Mar 2020
  5. Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers
     Zhuohan Li, Eric Wallace, Sheng Shen, Kevin Lin, Kurt Keutzer, Dan Klein, Joseph E. Gonzalez · 26 Feb 2020
  6. Towards the Systematic Reporting of the Energy and Carbon Footprints of Machine Learning
     Peter Henderson, Jie Hu, Joshua Romoff, Emma Brunskill, Dan Jurafsky, Joelle Pineau · 31 Jan 2020
  7. Approximating Activation Functions
     Nicholas Gerard Timmons, Andrew Rice · 17 Jan 2020
  8. Quantifying the Carbon Emissions of Machine Learning
     Alexandre Lacoste, A. Luccioni, Victor Schmidt, Thomas Dandres · 21 Oct 2019
  9. Accelerating Deep Learning by Focusing on the Biggest Losers
     Angela H. Jiang, Daniel L.-K. Wong, Giulio Zhou, D. Andersen, J. Dean, ..., Gauri Joshi, M. Kaminsky, M. Kozuch, Zachary Chase Lipton, Padmanabhan Pillai · 02 Oct 2019
  10. Reducing Transformer Depth on Demand with Structured Dropout
      Angela Fan, Edouard Grave, Armand Joulin · 25 Sep 2019
  11. Fixing the train-test resolution discrepancy
      Hugo Touvron, Andrea Vedaldi, Matthijs Douze, Hervé Jégou · 14 Jun 2019
  12. PowerSGD: Practical Low-Rank Gradient Compression for Distributed Optimization
      Thijs Vogels, Sai Praneeth Karimireddy, Martin Jaggi · 31 May 2019
  13. CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features
      Sangdoo Yun, Dongyoon Han, Seong Joon Oh, Sanghyuk Chun, Junsuk Choe, Y. Yoo [OOD] · 13 May 2019
  14. On The Power of Curriculum Learning in Training Deep Networks
      Guy Hacohen, D. Weinshall [ODL] · 07 Apr 2019
  15. Augment your batch: better training with larger batches
      Elad Hoffer, Tal Ben-Nun, Itay Hubara, Niv Giladi, Torsten Hoefler, Daniel Soudry [ODL] · 27 Jan 2019
  16. An Empirical Study of Example Forgetting during Deep Neural Network Learning
      Mariya Toneva, Alessandro Sordoni, Rémi Tachet des Combes, Adam Trischler, Yoshua Bengio, Geoffrey J. Gordon · 12 Dec 2018
  17. Understanding and correcting pathologies in the training of learned optimizers
      Luke Metz, Niru Maheswaranathan, Jeremy Nixon, C. Freeman, Jascha Narain Sohl-Dickstein [ODL] · 24 Oct 2018
  18. Local SGD Converges Fast and Communicates Little
      Sebastian U. Stich [FedML] · 24 May 2018
  19. AutoAugment: Learning Augmentation Policies from Data
      E. D. Cubuk, Barret Zoph, Dandelion Mané, Vijay Vasudevan, Quoc V. Le · 24 May 2018
  20. Faster Neural Network Training with Approximate Tensor Operations
      Menachem Adelman, Kfir Y. Levy, Ido Hakimi, M. Silberstein · 21 May 2018
  21. Group Normalization
      Yuxin Wu, Kaiming He · 22 Mar 2018
  22. Slow and Stale Gradients Can Win the Race: Error-Runtime Trade-offs in Distributed SGD
      Sanghamitra Dutta, Gauri Joshi, Soumyadip Ghosh, Parijat Dube, P. Nagpurkar · 03 Mar 2018
  23. Shampoo: Preconditioned Stochastic Tensor Optimization
      Vineet Gupta, Tomer Koren, Y. Singer [ODL] · 26 Feb 2018
  24. Don't Decay the Learning Rate, Increase the Batch Size
      Samuel L. Smith, Pieter-Jan Kindermans, Chris Ying, Quoc V. Le [ODL] · 01 Nov 2017
  25. Regularization for Deep Learning: A Taxonomy
      J. Kukačka, Vladimir Golkov, Daniel Cremers · 29 Oct 2017
  26. Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates
      L. Smith, Nicholay Topin [AI4CE] · 23 Aug 2017
  27. Learning Efficient Convolutional Networks through Network Slimming
      Zhuang Liu, Jianguo Li, Zhiqiang Shen, Gao Huang, Shoumeng Yan, Changshui Zhang · 22 Aug 2017
  28. Scalable Training of Artificial Neural Networks with Adaptive Sparse Connectivity inspired by Network Science
      Decebal Constantin Mocanu, Elena Mocanu, Peter Stone, Phuong H. Nguyen, M. Gibescu, A. Liotta · 15 Jul 2017
  29. FreezeOut: Accelerate Training by Progressively Freezing Layers
      Andrew Brock, Theodore Lim, J. Ritchie, Nick Weston · 15 Jun 2017
  30. Learning Deep ResNet Blocks Sequentially using Boosting Theory
      Furong Huang, Jordan T. Ash, John Langford, Robert Schapire · 15 Jun 2017
  31. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
      Chelsea Finn, Pieter Abbeel, Sergey Levine [OOD] · 09 Mar 2017
  32. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
      N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang [ODL] · 15 Sep 2016
  33. Using the Output Embedding to Improve Language Models
      Ofir Press, Lior Wolf · 20 Aug 2016
  34. SGDR: Stochastic Gradient Descent with Warm Restarts
      I. Loshchilov, Frank Hutter [ODL] · 13 Aug 2016
  35. Deep Networks with Stochastic Depth
      Gao Huang, Yu Sun, Zhuang Liu, Daniel Sedra, Kilian Q. Weinberger · 30 Mar 2016
  36. Online Batch Selection for Faster Training of Neural Networks
      I. Loshchilov, Frank Hutter [ODL] · 19 Nov 2015
  37. ACDC: A Structured Efficient Linear Layer
      Marcin Moczulski, Misha Denil, J. Appleyard, Nando de Freitas · 18 Nov 2015
  38. 8-Bit Approximations for Parallelism in Deep Learning
      Tim Dettmers · 14 Nov 2015
  39. Cyclical Learning Rates for Training Neural Networks
      L. Smith [ODL] · 03 Jun 2015
  40. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification
      Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun [VLM] · 06 Feb 2015
  41. Distributed Representations of Words and Phrases and their Compositionality
      Tomas Mikolov, Ilya Sutskever, Kai Chen, G. Corrado, J. Dean [NAI, OCL] · 16 Oct 2013