Compute-Efficient Deep Learning: Algorithmic Trends and Opportunities

Brian Bartoldson, B. Kailkhura, Davis W. Blalock
13 October 2022 · arXiv:2210.06640

Papers citing "Compute-Efficient Deep Learning: Algorithmic Trends and Opportunities"

50 of 91 citing papers shown.

Is Adversarial Training with Compressed Datasets Effective?
Tong Chen, Raghavendra Selvan
AAML · 08 Feb 2024

Efficiency is Not Enough: A Critical Perspective of Environmentally Sustainable AI
Dustin Wright, Christian Igel, Gabrielle Samuel, Raghavendra Selvan
05 Sep 2023

Which Transformer to Favor: A Comparative Analysis of Efficiency in Vision Transformers
Tobias Christian Nauen, Sebastián M. Palacio, Federico Raue, Andreas Dengel
18 Aug 2023

Estimating the Carbon Footprint of BLOOM, a 176B Parameter Language Model
A. Luccioni, S. Viguier, Anne-Laure Ligozat
03 Nov 2022

Stop Wasting My Time! Saving Days of ImageNet and BERT Training with Latest Weight Averaging
Jean Kaddour
MoMe · 3DH · 29 Sep 2022

Beyond neural scaling laws: beating power law scaling via data pruning
Ben Sorscher, Robert Geirhos, Shashank Shekhar, Surya Ganguli, Ari S. Morcos
29 Jun 2022

The Carbon Footprint of Machine Learning Training Will Plateau, Then Shrink
David A. Patterson, Joseph E. Gonzalez, Urs Hölzle, Quoc V. Le, Chen Liang, Lluís-Miquel Munguía, D. Rothchild, David R. So, Maud Texier, J. Dean
AI4CE · 11 Apr 2022

Monarch: Expressive Structured Matrices for Efficient and Accurate Training
Tri Dao, Beidi Chen, N. Sohoni, Arjun D Desai, Michael Poli, Jessica Grogan, Alexander Liu, Aniruddh Rao, Atri Rudra, Christopher Ré
01 Apr 2022

Training Compute-Optimal Large Language Models
Jordan Hoffmann, Sebastian Borgeaud, A. Mensch, Elena Buchatskaya, Trevor Cai, ..., Karen Simonyan, Erich Elsen, Jack W. Rae, Oriol Vinyals, Laurent Sifre
AI4TS · 29 Mar 2022

Automated Progressive Learning for Efficient Training of Vision Transformers
Changlin Li, Bohan Zhuang, Guangrun Wang, Xiaodan Liang, Xiaojun Chang, Yi Yang
28 Mar 2022

Benchmarking Test-Time Unsupervised Deep Neural Network Adaptation on Edge Devices
K. Bhardwaj, James Diffenderfer, B. Kailkhura, Maya Gokhale
AAML · OOD · 21 Mar 2022

AUTOMATA: Gradient Based Data Subset Selection for Compute-Efficient Hyper-parameter Tuning
Krishnateja Killamsetty, Guttu Sai Abhishek, Aakriti, A. Evfimievski, Lucian Popa, Ganesh Ramakrishnan, Rishabh K. Iyer
15 Mar 2022

Tensor Programs V: Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer
Greg Yang, J. E. Hu, Igor Babuschkin, Szymon Sidor, Xiaodong Liu, David Farhi, Nick Ryder, J. Pachocki, Weizhu Chen, Jianfeng Gao
07 Mar 2022

Maximizing Communication Efficiency for Large-scale Training via 0/1 Adam
Yucheng Lu, Conglong Li, Minjia Zhang, Christopher De Sa, Yuxiong He
OffRL · AI4CE · 12 Feb 2022

DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale
Samyam Rajbhandari, Conglong Li, Z. Yao, Minjia Zhang, Reza Yazdani Aminabadi, A. A. Awan, Jeff Rasley, Yuxiong He
14 Jan 2022

Swin Transformer V2: Scaling Up Capacity and Resolution
Ze Liu, Han Hu, Yutong Lin, Zhuliang Yao, Zhenda Xie, ..., Yue Cao, Zheng Zhang, Li Dong, Furu Wei, B. Guo
ViT · 18 Nov 2021

Persia: An Open, Hybrid System Scaling Deep Learning-based Recommenders up to 100 Trillion Parameters
Xiangru Lian, Binhang Yuan, Xuefeng Zhu, Yulong Wang, Yongjun He, ..., Lei Yuan, Hai-bo Yu, Sen Yang, Ce Zhang, Ji Liu
VLM · 10 Nov 2021

MEST: Accurate and Fast Memory-Economic Sparse Training Framework on the Edge
Geng Yuan, Xiaolong Ma, Wei Niu, Zhengang Li, Zhenglun Kong, ..., Minghai Qin, Bin Ren, Yanzhi Wang, Sijia Liu, Xue Lin
26 Oct 2021

The Efficiency Misnomer
Daoyuan Chen, Liuyi Yao, Dawei Gao, Ashish Vaswani, Yaliang Li
25 Oct 2021

8-bit Optimizers via Block-wise Quantization
Tim Dettmers, M. Lewis, Sam Shleifer, Luke Zettlemoyer
MQ · 06 Oct 2021

ResNet strikes back: An improved training procedure in timm
Ross Wightman, Hugo Touvron, Hervé Jégou
AI4TS · 01 Oct 2021

Primer: Searching for Efficient Transformers for Language Modeling
David R. So, Wojciech Mańke, Hanxiao Liu, Zihang Dai, Noam M. Shazeer, Quoc V. Le
VLM · 17 Sep 2021

Deep Learning on a Data Diet: Finding Important Examples Early in Training
Mansheej Paul, Surya Ganguli, Gintare Karolina Dziugaite
15 Jul 2021

Physics-Guided Deep Learning for Dynamical Systems: A Survey
Rui Wang, Rose Yu
AI4CE · PINN · 02 Jul 2021

Deep Ensembling with No Overhead for either Training or Testing: The All-Round Blessings of Dynamic Sparsity
Shiwei Liu, Tianlong Chen, Zahra Atashgahi, Xiaohan Chen, Ghada Sokar, Elena Mocanu, Mykola Pechenizkiy, Zhangyang Wang, Decebal Constantin Mocanu
OOD · 28 Jun 2021

Lossy Compression for Lossless Prediction
Yann Dubois, Benjamin Bloem-Reddy, Karen Ullrich, Chris J. Maddison
21 Jun 2021

How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers
Andreas Steiner, Alexander Kolesnikov, Xiaohua Zhai, Ross Wightman, Jakob Uszkoreit, Lucas Beyer
ViT · 18 Jun 2021

Pre-Trained Models: Past, Present and Future
Xu Han, Zhengyan Zhang, Ning Ding, Yuxian Gu, Xiao Liu, ..., Jie Tang, Ji-Rong Wen, Jinhui Yuan, Wayne Xin Zhao, Jun Zhu
AIFin · MQ · AI4MH · 14 Jun 2021

CoAtNet: Marrying Convolution and Attention for All Data Sizes
Zihang Dai, Hanxiao Liu, Quoc V. Le, Mingxing Tan
ViT · 09 Jun 2021

Drawing Multiple Augmentation Samples Per Image During Training Efficiently Decreases Test Error
Stanislav Fort, Andrew Brock, Razvan Pascanu, Soham De, Samuel L. Smith
27 May 2021

Carbon Emissions and Large Neural Network Training
David A. Patterson, Joseph E. Gonzalez, Quoc V. Le, Chen Liang, Lluís-Miquel Munguía, D. Rothchild, David R. So, Maud Texier, J. Dean
AI4CE · 21 Apr 2021

Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM
Deepak Narayanan, Mohammad Shoeybi, Jared Casper, P. LeGresley, M. Patwary, ..., Prethvi Kashinkunti, J. Bernauer, Bryan Catanzaro, Amar Phanishayee, Matei A. Zaharia
MoE · 09 Apr 2021

Revisiting ResNets: Improved Training and Scaling Strategies
Irwan Bello, W. Fedus, Xianzhi Du, E. D. Cubuk, A. Srinivas, Nayeon Lee, Jonathon Shlens, Barret Zoph
13 Mar 2021

Zero-Shot Text-to-Image Generation
Aditya A. Ramesh, Mikhail Pavlov, Gabriel Goh, Scott Gray, Chelsea Voss, Alec Radford, Mark Chen, Ilya Sutskever
VLM · 24 Feb 2021

LambdaNetworks: Modeling Long-Range Interactions Without Attention
Irwan Bello
17 Feb 2021

High-Performance Large-Scale Image Recognition Without Normalization
Andrew Brock, Soham De, Samuel L. Smith, Karen Simonyan
VLM · 11 Feb 2021

Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks
Torsten Hoefler, Dan Alistarh, Tal Ben-Nun, Nikoli Dryden, Alexandra Peste
MQ · 31 Jan 2021

Bottleneck Transformers for Visual Recognition
A. Srinivas, Nayeon Lee, Niki Parmar, Jonathon Shlens, Pieter Abbeel, Ashish Vaswani
SLR · 27 Jan 2021

Clairvoyant Prefetching for Distributed Machine Learning I/O
Nikoli Dryden, Roman Böhringer, Tal Ben-Nun, Torsten Hoefler
21 Jan 2021

RepVGG: Making VGG-style ConvNets Great Again
Xiaohan Ding, Xinming Zhang, Ningning Ma, Jungong Han, Guiguang Ding, Jian Sun
11 Jan 2021

EarlyBERT: Efficient BERT Training via Early-bird Lottery Tickets
Xiaohan Chen, Yu Cheng, Shuohang Wang, Zhe Gan, Zhangyang Wang, Jingjing Liu
31 Dec 2020

Deconstructing the Structure of Sparse Neural Networks
M. V. Gelder, Mitchell Wortsman, Kiana Ehsani
30 Nov 2020

FreezeNet: Full Performance by Reduced Storage Costs
Paul Wimmer, Jens Mehnert, Alexandru Paul Condurache
28 Nov 2020

Long Range Arena: A Benchmark for Efficient Transformers
Yi Tay, Mostafa Dehghani, Samira Abnar, Songlin Yang, Dara Bahri, Philip Pham, J. Rao, Liu Yang, Sebastian Ruder, Donald Metzler
08 Nov 2020

μNAS: Constrained Neural Architecture Search for Microcontrollers
Edgar Liberis, Łukasz Dudziak, Nicholas D. Lane
BDL · 27 Oct 2020

Sharpness-Aware Minimization for Efficiently Improving Generalization
Pierre Foret, Ariel Kleiner, H. Mobahi, Behnam Neyshabur
AAML · 03 Oct 2020

Rethinking Attention with Performers
K. Choromanski, Valerii Likhosherstov, David Dohan, Xingyou Song, Andreea Gane, ..., Afroz Mohiuddin, Lukasz Kaiser, David Belanger, Lucy J. Colwell, Adrian Weller
30 Sep 2020

The Computational Limits of Deep Learning
Neil C. Thompson, Kristjan Greenewald, Keeheon Lee, Gabriel F. Manso
VLM · 10 Jul 2020

Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention
Angelos Katharopoulos, Apoorv Vyas, Nikolaos Pappas, François Fleuret
29 Jun 2020

Linformer: Self-Attention with Linear Complexity
Sinong Wang, Belinda Z. Li, Madian Khabsa, Han Fang, Hao Ma
08 Jun 2020