ResearchTrend.AI

Compute-Efficient Deep Learning: Algorithmic Trends and Opportunities
arXiv:2210.06640 · 13 October 2022
Brian Bartoldson, B. Kailkhura, Davis W. Blalock

Papers citing "Compute-Efficient Deep Learning: Algorithmic Trends and Opportunities"

41 of 91 citing papers shown.

  1. ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning
     Z. Yao, A. Gholami, Sheng Shen, Mustafa Mustafa, Kurt Keutzer, Michael W. Mahoney [ODL] · 01 Jun 2020
  2. The Cost of Training NLP Models: A Concise Overview
     Or Sharir, Barak Peleg, Y. Shoham · 19 Apr 2020
  3. Gradient Centralization: A New Optimization Technique for Deep Neural Networks
     Hongwei Yong, Jianqiang Huang, Xiansheng Hua, Lei Zhang [ODL] · 03 Apr 2020
  4. What is the State of Neural Network Pruning?
     Davis W. Blalock, Jose Javier Gonzalez Ortiz, Jonathan Frankle, John Guttag · 06 Mar 2020
  5. Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers
     Zhuohan Li, Eric Wallace, Sheng Shen, Kevin Lin, Kurt Keutzer, Dan Klein, Joseph E. Gonzalez · 26 Feb 2020
  6. Towards the Systematic Reporting of the Energy and Carbon Footprints of Machine Learning
     Peter Henderson, Jie Hu, Joshua Romoff, Emma Brunskill, Dan Jurafsky, Joelle Pineau · 31 Jan 2020
  7. Approximating Activation Functions
     Nicholas Gerard Timmons, Andrew Rice · 17 Jan 2020
  8. Quantifying the Carbon Emissions of Machine Learning
     Alexandre Lacoste, A. Luccioni, Victor Schmidt, Thomas Dandres · 21 Oct 2019
  9. Accelerating Deep Learning by Focusing on the Biggest Losers
     Angela H. Jiang, Daniel L.-K. Wong, Giulio Zhou, D. Andersen, J. Dean, ..., Gauri Joshi, M. Kaminsky, M. Kozuch, Zachary Chase Lipton, Padmanabhan Pillai · 02 Oct 2019
  10. Reducing Transformer Depth on Demand with Structured Dropout
      Angela Fan, Edouard Grave, Armand Joulin · 25 Sep 2019
  11. Fixing the train-test resolution discrepancy
      Hugo Touvron, Andrea Vedaldi, Matthijs Douze, Hervé Jégou · 14 Jun 2019
  12. PowerSGD: Practical Low-Rank Gradient Compression for Distributed Optimization
      Thijs Vogels, Sai Praneeth Karimireddy, Martin Jaggi · 31 May 2019
  13. CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features
      Sangdoo Yun, Dongyoon Han, Seong Joon Oh, Sanghyuk Chun, Junsuk Choe, Y. Yoo [OOD] · 13 May 2019
  14. On The Power of Curriculum Learning in Training Deep Networks
      Guy Hacohen, D. Weinshall [ODL] · 07 Apr 2019
  15. Augment your batch: better training with larger batches
      Elad Hoffer, Tal Ben-Nun, Itay Hubara, Niv Giladi, Torsten Hoefler, Daniel Soudry [ODL] · 27 Jan 2019
  16. An Empirical Study of Example Forgetting during Deep Neural Network Learning
      Mariya Toneva, Alessandro Sordoni, Rémi Tachet des Combes, Adam Trischler, Yoshua Bengio, Geoffrey J. Gordon · 12 Dec 2018
  17. Understanding and correcting pathologies in the training of learned optimizers
      Luke Metz, Niru Maheswaranathan, Jeremy Nixon, C. Freeman, Jascha Narain Sohl-Dickstein [ODL] · 24 Oct 2018
  18. Local SGD Converges Fast and Communicates Little
      Sebastian U. Stich [FedML] · 24 May 2018
  19. AutoAugment: Learning Augmentation Policies from Data
      E. D. Cubuk, Barret Zoph, Dandelion Mané, Vijay Vasudevan, Quoc V. Le · 24 May 2018
  20. Faster Neural Network Training with Approximate Tensor Operations
      Menachem Adelman, Kfir Y. Levy, Ido Hakimi, M. Silberstein · 21 May 2018
  21. Group Normalization
      Yuxin Wu, Kaiming He · 22 Mar 2018
  22. Slow and Stale Gradients Can Win the Race: Error-Runtime Trade-offs in Distributed SGD
      Sanghamitra Dutta, Gauri Joshi, Soumyadip Ghosh, Parijat Dube, P. Nagpurkar · 03 Mar 2018
  23. Shampoo: Preconditioned Stochastic Tensor Optimization
      Vineet Gupta, Tomer Koren, Y. Singer [ODL] · 26 Feb 2018
  24. Don't Decay the Learning Rate, Increase the Batch Size
      Samuel L. Smith, Pieter-Jan Kindermans, Chris Ying, Quoc V. Le [ODL] · 01 Nov 2017
  25. Regularization for Deep Learning: A Taxonomy
      J. Kukačka, Vladimir Golkov, Daniel Cremers · 29 Oct 2017
  26. Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates
      L. Smith, Nicholay Topin [AI4CE] · 23 Aug 2017
  27. Learning Efficient Convolutional Networks through Network Slimming
      Zhuang Liu, Jianguo Li, Zhiqiang Shen, Gao Huang, Shoumeng Yan, Changshui Zhang · 22 Aug 2017
  28. Scalable Training of Artificial Neural Networks with Adaptive Sparse Connectivity inspired by Network Science
      Decebal Constantin Mocanu, Elena Mocanu, Peter Stone, Phuong H. Nguyen, M. Gibescu, A. Liotta · 15 Jul 2017
  29. FreezeOut: Accelerate Training by Progressively Freezing Layers
      Andrew Brock, Theodore Lim, J. Ritchie, Nick Weston · 15 Jun 2017
  30. Learning Deep ResNet Blocks Sequentially using Boosting Theory
      Furong Huang, Jordan T. Ash, John Langford, Robert Schapire · 15 Jun 2017
  31. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks
      Chelsea Finn, Pieter Abbeel, Sergey Levine [OOD] · 09 Mar 2017
  32. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
      N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang [ODL] · 15 Sep 2016
  33. Using the Output Embedding to Improve Language Models
      Ofir Press, Lior Wolf · 20 Aug 2016
  34. SGDR: Stochastic Gradient Descent with Warm Restarts
      I. Loshchilov, Frank Hutter [ODL] · 13 Aug 2016
  35. Deep Networks with Stochastic Depth
      Gao Huang, Yu Sun, Zhuang Liu, Daniel Sedra, Kilian Q. Weinberger · 30 Mar 2016
  36. Online Batch Selection for Faster Training of Neural Networks
      I. Loshchilov, Frank Hutter [ODL] · 19 Nov 2015
  37. ACDC: A Structured Efficient Linear Layer
      Marcin Moczulski, Misha Denil, J. Appleyard, Nando de Freitas · 18 Nov 2015
  38. 8-Bit Approximations for Parallelism in Deep Learning
      Tim Dettmers · 14 Nov 2015
  39. Cyclical Learning Rates for Training Neural Networks
      L. Smith [ODL] · 03 Jun 2015
  40. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification
      Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun [VLM] · 06 Feb 2015
  41. Distributed Representations of Words and Phrases and their Compositionality
      Tomas Mikolov, Ilya Sutskever, Kai Chen, G. Corrado, J. Dean [NAI, OCL] · 16 Oct 2013