arXiv: 2210.06640
Compute-Efficient Deep Learning: Algorithmic Trends and Opportunities
13 October 2022
Brian Bartoldson, B. Kailkhura, Davis W. Blalock
Papers citing "Compute-Efficient Deep Learning: Algorithmic Trends and Opportunities"
50 / 91 papers shown
Is Adversarial Training with Compressed Datasets Effective?
Tong Chen, Raghavendra Selvan [AAML] · 109 / 0 / 0 · 08 Feb 2024

Efficiency is Not Enough: A Critical Perspective of Environmentally Sustainable AI
Dustin Wright, Christian Igel, Gabrielle Samuel, Raghavendra Selvan · 70 / 15 / 0 · 05 Sep 2023

Which Transformer to Favor: A Comparative Analysis of Efficiency in Vision Transformers
Tobias Christian Nauen, Sebastián M. Palacio, Federico Raue, Andreas Dengel · 79 / 4 / 0 · 18 Aug 2023

Estimating the Carbon Footprint of BLOOM, a 176B Parameter Language Model
A. Luccioni, S. Viguier, Anne-Laure Ligozat · 84 / 272 / 0 · 03 Nov 2022

Stop Wasting My Time! Saving Days of ImageNet and BERT Training with Latest Weight Averaging
Jean Kaddour [MoMe, 3DH] · 45 / 41 / 0 · 29 Sep 2022

Beyond neural scaling laws: beating power law scaling via data pruning
Ben Sorscher, Robert Geirhos, Shashank Shekhar, Surya Ganguli, Ari S. Morcos · 64 / 436 / 0 · 29 Jun 2022

The Carbon Footprint of Machine Learning Training Will Plateau, Then Shrink
David A. Patterson, Joseph E. Gonzalez, Urs Hölzle, Quoc V. Le, Chen Liang, Lluís-Miquel Munguía, D. Rothchild, David R. So, Maud Texier, J. Dean [AI4CE] · 64 / 239 / 0 · 11 Apr 2022

Monarch: Expressive Structured Matrices for Efficient and Accurate Training
Tri Dao, Beidi Chen, N. Sohoni, Arjun D Desai, Michael Poli, Jessica Grogan, Alexander Liu, Aniruddh Rao, Atri Rudra, Christopher Ré · 76 / 90 / 0 · 01 Apr 2022

Training Compute-Optimal Large Language Models
Jordan Hoffmann, Sebastian Borgeaud, A. Mensch, Elena Buchatskaya, Trevor Cai, ..., Karen Simonyan, Erich Elsen, Jack W. Rae, Oriol Vinyals, Laurent Sifre [AI4TS] · 137 / 1,915 / 0 · 29 Mar 2022

Automated Progressive Learning for Efficient Training of Vision Transformers
Changlin Li, Bohan Zhuang, Guangrun Wang, Xiaodan Liang, Xiaojun Chang, Yi Yang · 67 / 46 / 0 · 28 Mar 2022

Benchmarking Test-Time Unsupervised Deep Neural Network Adaptation on Edge Devices
K. Bhardwaj, James Diffenderfer, B. Kailkhura, Maya Gokhale [AAML, OOD] · 67 / 3 / 0 · 21 Mar 2022

AUTOMATA: Gradient Based Data Subset Selection for Compute-Efficient Hyper-parameter Tuning
Krishnateja Killamsetty, Guttu Sai Abhishek, Aakriti, A. Evfimievski, Lucian Popa, Ganesh Ramakrishnan, Rishabh K. Iyer · 31 / 25 / 0 · 15 Mar 2022

Tensor Programs V: Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer
Greg Yang, J. E. Hu, Igor Babuschkin, Szymon Sidor, Xiaodong Liu, David Farhi, Nick Ryder, J. Pachocki, Weizhu Chen, Jianfeng Gao · 64 / 155 / 0 · 07 Mar 2022

Maximizing Communication Efficiency for Large-scale Training via 0/1 Adam
Yucheng Lu, Conglong Li, Minjia Zhang, Christopher De Sa, Yuxiong He [OffRL, AI4CE] · 40 / 21 / 0 · 12 Feb 2022

DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale
Samyam Rajbhandari, Conglong Li, Z. Yao, Minjia Zhang, Reza Yazdani Aminabadi, A. A. Awan, Jeff Rasley, Yuxiong He · 61 / 292 / 0 · 14 Jan 2022

Swin Transformer V2: Scaling Up Capacity and Resolution
Ze Liu, Han Hu, Yutong Lin, Zhuliang Yao, Zhenda Xie, ..., Yue Cao, Zheng Zhang, Li Dong, Furu Wei, B. Guo [ViT] · 184 / 1,783 / 0 · 18 Nov 2021

Persia: An Open, Hybrid System Scaling Deep Learning-based Recommenders up to 100 Trillion Parameters
Xiangru Lian, Binhang Yuan, Xuefeng Zhu, Yulong Wang, Yongjun He, ..., Lei Yuan, Hai-bo Yu, Sen Yang, Ce Zhang, Ji Liu [VLM] · 47 / 34 / 0 · 10 Nov 2021

MEST: Accurate and Fast Memory-Economic Sparse Training Framework on the Edge
Geng Yuan, Xiaolong Ma, Wei Niu, Zhengang Li, Zhenglun Kong, ..., Minghai Qin, Bin Ren, Yanzhi Wang, Sijia Liu, Xue Lin · 45 / 93 / 0 · 26 Oct 2021

The Efficiency Misnomer
Daoyuan Chen, Liuyi Yao, Dawei Gao, Ashish Vaswani, Yaliang Li · 74 / 101 / 0 · 25 Oct 2021

8-bit Optimizers via Block-wise Quantization
Tim Dettmers, M. Lewis, Sam Shleifer, Luke Zettlemoyer [MQ] · 99 / 286 / 0 · 06 Oct 2021

ResNet strikes back: An improved training procedure in timm
Ross Wightman, Hugo Touvron, Hervé Jégou [AI4TS] · 230 / 489 / 0 · 01 Oct 2021

Primer: Searching for Efficient Transformers for Language Modeling
David R. So, Wojciech Mańke, Hanxiao Liu, Zihang Dai, Noam M. Shazeer, Quoc V. Le [VLM] · 203 / 154 / 0 · 17 Sep 2021

Deep Learning on a Data Diet: Finding Important Examples Early in Training
Mansheej Paul, Surya Ganguli, Gintare Karolina Dziugaite · 93 / 446 / 0 · 15 Jul 2021

Physics-Guided Deep Learning for Dynamical Systems: A Survey
Rui Wang, Rose Yu [AI4CE, PINN] · 72 / 67 / 0 · 02 Jul 2021

Deep Ensembling with No Overhead for either Training or Testing: The All-Round Blessings of Dynamic Sparsity
Shiwei Liu, Tianlong Chen, Zahra Atashgahi, Xiaohan Chen, Ghada Sokar, Elena Mocanu, Mykola Pechenizkiy, Zhangyang Wang, Decebal Constantin Mocanu [OOD] · 47 / 52 / 0 · 28 Jun 2021

Lossy Compression for Lossless Prediction
Yann Dubois, Benjamin Bloem-Reddy, Karen Ullrich, Chris J. Maddison · 68 / 60 / 0 · 21 Jun 2021

How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers
Andreas Steiner, Alexander Kolesnikov, Xiaohua Zhai, Ross Wightman, Jakob Uszkoreit, Lucas Beyer [ViT] · 101 / 623 / 0 · 18 Jun 2021

Pre-Trained Models: Past, Present and Future
Xu Han, Zhengyan Zhang, Ning Ding, Yuxian Gu, Xiao Liu, ..., Jie Tang, Ji-Rong Wen, Jinhui Yuan, Wayne Xin Zhao, Jun Zhu [AIFin, MQ, AI4MH] · 119 / 836 / 0 · 14 Jun 2021

CoAtNet: Marrying Convolution and Attention for All Data Sizes
Zihang Dai, Hanxiao Liu, Quoc V. Le, Mingxing Tan [ViT] · 93 / 1,188 / 0 · 09 Jun 2021

Drawing Multiple Augmentation Samples Per Image During Training Efficiently Decreases Test Error
Stanislav Fort, Andrew Brock, Razvan Pascanu, Soham De, Samuel L. Smith · 43 / 31 / 0 · 27 May 2021

Carbon Emissions and Large Neural Network Training
David A. Patterson, Joseph E. Gonzalez, Quoc V. Le, Chen Liang, Lluís-Miquel Munguía, D. Rothchild, David R. So, Maud Texier, J. Dean [AI4CE] · 321 / 658 / 0 · 21 Apr 2021

Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM
Deepak Narayanan, Mohammad Shoeybi, Jared Casper, P. LeGresley, M. Patwary, ..., Prethvi Kashinkunti, J. Bernauer, Bryan Catanzaro, Amar Phanishayee, Matei A. Zaharia [MoE] · 74 / 667 / 0 · 09 Apr 2021

Revisiting ResNets: Improved Training and Scaling Strategies
Irwan Bello, W. Fedus, Xianzhi Du, E. D. Cubuk, A. Srinivas, Nayeon Lee, Jonathon Shlens, Barret Zoph · 62 / 299 / 0 · 13 Mar 2021

Zero-Shot Text-to-Image Generation
Aditya A. Ramesh, Mikhail Pavlov, Gabriel Goh, Scott Gray, Chelsea Voss, Alec Radford, Mark Chen, Ilya Sutskever [VLM] · 329 / 4,873 / 0 · 24 Feb 2021

LambdaNetworks: Modeling Long-Range Interactions Without Attention
Irwan Bello · 317 / 180 / 0 · 17 Feb 2021

High-Performance Large-Scale Image Recognition Without Normalization
Andrew Brock, Soham De, Samuel L. Smith, Karen Simonyan [VLM] · 254 / 514 / 0 · 11 Feb 2021

Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks
Torsten Hoefler, Dan Alistarh, Tal Ben-Nun, Nikoli Dryden, Alexandra Peste [MQ] · 264 / 703 / 0 · 31 Jan 2021

Bottleneck Transformers for Visual Recognition
A. Srinivas, Nayeon Lee, Niki Parmar, Jonathon Shlens, Pieter Abbeel, Ashish Vaswani [SLR] · 327 / 986 / 0 · 27 Jan 2021

Clairvoyant Prefetching for Distributed Machine Learning I/O
Nikoli Dryden, Roman Böhringer, Tal Ben-Nun, Torsten Hoefler · 54 / 57 / 0 · 21 Jan 2021

RepVGG: Making VGG-style ConvNets Great Again
Xiaohan Ding, Xinming Zhang, Ningning Ma, Jungong Han, Guiguang Ding, Jian Sun · 233 / 1,574 / 0 · 11 Jan 2021

EarlyBERT: Efficient BERT Training via Early-bird Lottery Tickets
Xiaohan Chen, Yu Cheng, Shuohang Wang, Zhe Gan, Zhangyang Wang, Jingjing Liu · 71 / 100 / 0 · 31 Dec 2020

Deconstructing the Structure of Sparse Neural Networks
M. V. Gelder, Mitchell Wortsman, Kiana Ehsani · 26 / 1 / 0 · 30 Nov 2020

FreezeNet: Full Performance by Reduced Storage Costs
Paul Wimmer, Jens Mehnert, Alexandru Paul Condurache · 47 / 13 / 0 · 28 Nov 2020

Long Range Arena: A Benchmark for Efficient Transformers
Yi Tay, Mostafa Dehghani, Samira Abnar, Songlin Yang, Dara Bahri, Philip Pham, J. Rao, Liu Yang, Sebastian Ruder, Donald Metzler · 110 / 706 / 0 · 08 Nov 2020

μNAS: Constrained Neural Architecture Search for Microcontrollers
Edgar Liberis, Łukasz Dudziak, Nicholas D. Lane [BDL] · 28 / 104 / 0 · 27 Oct 2020

Sharpness-Aware Minimization for Efficiently Improving Generalization
Pierre Foret, Ariel Kleiner, H. Mobahi, Behnam Neyshabur [AAML] · 171 / 1,323 / 0 · 03 Oct 2020

Rethinking Attention with Performers
K. Choromanski, Valerii Likhosherstov, David Dohan, Xingyou Song, Andreea Gane, ..., Afroz Mohiuddin, Lukasz Kaiser, David Belanger, Lucy J. Colwell, Adrian Weller · 148 / 1,548 / 0 · 30 Sep 2020

The Computational Limits of Deep Learning
Neil C. Thompson, Kristjan Greenewald, Keeheon Lee, Gabriel F. Manso [VLM] · 46 / 518 / 0 · 10 Jul 2020

Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention
Angelos Katharopoulos, Apoorv Vyas, Nikolaos Pappas, François Fleuret · 132 / 1,734 / 0 · 29 Jun 2020

Linformer: Self-Attention with Linear Complexity
Sinong Wang, Belinda Z. Li, Madian Khabsa, Han Fang, Hao Ma · 179 / 1,678 / 0 · 08 Jun 2020