2104.10350
Carbon Emissions and Large Neural Network Training
21 April 2021
David A. Patterson
Joseph E. Gonzalez
Quoc V. Le
Chen Liang
Lluís-Miquel Munguía
D. Rothchild
David R. So
Maud Texier
J. Dean
AI4CE
Papers citing "Carbon Emissions and Large Neural Network Training"
50 / 126 papers shown
Title
Learning a Consensus Sub-Network with Polarization Regularization and One Pass Training
Xiaoying Zhi
Varun Babbar
P. Sun
Fran Silavong
Ruibo Shi
Sean J. Moran
42
1
0
17 Feb 2023
SplitOut: Out-of-the-Box Training-Hijacking Detection in Split Learning via Outlier Detection
Ege Erdogan
Unat Teksen
Mehmet Salih Celiktenyildiz
Alptekin Kupcu
A. E. Cicek
46
4
0
16 Feb 2023
Counting Carbon: A Survey of Factors Influencing the Emissions of Machine Learning
A. Luccioni
Alex Hernandez-Garcia
31
45
0
16 Feb 2023
Energy Efficiency of Training Neural Network Architectures: An Empirical Study
Yi Xu
Silverio Martínez-Fernández
Matias Martinez
Xavier Franch
20
13
0
02 Feb 2023
Learning Reservoir Dynamics with Temporal Self-Modulation
Yusuke Sakemi
S. Nobukawa
Toshitaka Matsuki
Takashi Morie
Kazuyuki Aihara
19
6
0
23 Jan 2023
Quantum-inspired tensor network for Earth science
Soronzonbold Otgonbaatar
Dieter Kranzlmüller
PINN
AI4CE
28
4
0
15 Jan 2023
Balance is Essence: Accelerating Sparse Training via Adaptive Gradient Correction
Bowen Lei
Dongkuan Xu
Ruqi Zhang
Shuren He
Bani Mallick
32
6
0
09 Jan 2023
Memory-efficient NLLB-200: Language-specific Expert Pruning of a Massively Multilingual Machine Translation Model
Yeskendir Koishekenov
Alexandre Berard
Vassilina Nikoulina
MoE
35
29
0
19 Dec 2022
Natural Language to Code Generation in Interactive Data Science Notebooks
Pengcheng Yin
Wen-Ding Li
Kefan Xiao
Abhishek Rao
Yeming Wen
...
Paige Bailey
Michele Catasta
Henryk Michalewski
Oleksandr Polozov
Charles Sutton
33
57
0
19 Dec 2022
Dual adaptive training of photonic neural networks
Ziyang Zheng
Zhengyang Duan
Hang Chen
Rui Yang
Sheng Gao
Haiou Zhang
H. Xiong
Xing Lin
19
30
0
09 Dec 2022
BudgetLongformer: Can we Cheaply Pretrain a SotA Legal Language Model From Scratch?
Joel Niklaus
Daniele Giofré
30
11
0
30 Nov 2022
Dynamic Sparse Training via Balancing the Exploration-Exploitation Trade-off
Shaoyi Huang
Bowen Lei
Dongkuan Xu
Hongwu Peng
Yue Sun
Mimi Xie
Caiwen Ding
26
19
0
30 Nov 2022
The European AI Liability Directives -- Critique of a Half-Hearted Approach and Lessons for the Future
P. Hacker
AILaw
26
59
0
25 Nov 2022
Convolutional Neural Generative Coding: Scaling Predictive Coding to Natural Images
Alexander Ororbia
A. Mali
BDL
30
10
0
22 Nov 2022
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
BigScience Workshop:
Teven Le Scao
Angela Fan
Christopher Akiki
...
Zhongli Xie
Zifan Ye
M. Bras
Younes Belkada
Thomas Wolf
VLM
116
2,310
0
09 Nov 2022
Mask More and Mask Later: Efficient Pre-training of Masked Language Models by Disentangling the [MASK] Token
Baohao Liao
David Thulke
Sanjika Hewavitharana
Hermann Ney
Christof Monz
30
9
0
09 Nov 2022
Astronomia ex machina: a history, primer, and outlook on neural networks in astronomy
Michael J. Smith
James E. Geach
35
32
0
07 Nov 2022
Desiderata for next generation of ML model serving
Sherif Akoush
Andrei Paleyes
A. V. Looveren
Clive Cox
33
5
0
26 Oct 2022
Revisiting Softmax for Uncertainty Approximation in Text Classification
Andreas Nugaard Holm
Dustin Wright
Isabelle Augenstein
BDL
UQCV
16
8
0
25 Oct 2022
On the Adversarial Robustness of Mixture of Experts
J. Puigcerver
Rodolphe Jenatton
C. Riquelme
Pranjal Awasthi
Srinadh Bhojanapalli
OOD
AAML
MoE
37
18
0
19 Oct 2022
Mass-Editing Memory in a Transformer
Kevin Meng
Arnab Sen Sharma
A. Andonian
Yonatan Belinkov
David Bau
KELM
VLM
33
525
0
13 Oct 2022
Decoupled Context Processing for Context Augmented Language Modeling
Zonglin Li
Ruiqi Guo
Surinder Kumar
RALM
KELM
21
23
0
11 Oct 2022
Ecovisor: A Virtual Energy System for Carbon-Efficient Applications
Abel Souza
Noman Bashir
Jorge R. Murillo
Walid A. Hanafy
Qianlin Liang
David Irwin
Prashant J. Shenoy
53
54
0
10 Oct 2022
GLM-130B: An Open Bilingual Pre-trained Model
Aohan Zeng
Xiao Liu
Zhengxiao Du
Zihan Wang
Hanyu Lai
...
Jidong Zhai
Wenguang Chen
Peng-Zhen Zhang
Yuxiao Dong
Jie Tang
BDL
LRM
250
1,073
0
05 Oct 2022
PolyMPCNet: Towards ReLU-free Neural Architecture Search in Two-party Computation Based Private Inference
Hongwu Peng
Shangli Zhou
Yukui Luo
Shijin Duan
Nuo Xu
...
Tong Geng
Ang Li
Wujie Wen
Xiaolin Xu
Caiwen Ding
26
3
0
20 Sep 2022
Efficient Quantized Sparse Matrix Operations on Tensor Cores
Shigang Li
Kazuki Osawa
Torsten Hoefler
82
31
0
14 Sep 2022
Efficient Methods for Natural Language Processing: A Survey
Marcos Vinícius Treviso
Ji-Ung Lee
Tianchu Ji
Betty van Aken
Qingqing Cao
...
Emma Strubell
Niranjan Balasubramanian
Leon Derczynski
Iryna Gurevych
Roy Schwartz
28
109
0
31 Aug 2022
Complexity-Driven CNN Compression for Resource-constrained Edge AI
Muhammad Zawish
Steven Davy
L. Abraham
33
16
0
26 Aug 2022
AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model
Saleh Soltan
Shankar Ananthakrishnan
Jack G. M. FitzGerald
Rahul Gupta
Wael Hamza
...
Mukund Sridhar
Fabian Triefenbach
Apurv Verma
Gokhan Tur
Premkumar Natarajan
54
82
0
02 Aug 2022
Eco2AI: carbon emissions tracking of machine learning models as the first step towards sustainable AI
S. Budennyy
V. Lazarev
N. Zakharenko
A. Korovin
Olga Plosskaya
...
Ivan V. Oseledets
I. Barsola
Ilya M. Egorov
A. Kosterina
L. Zhukov
34
90
0
31 Jul 2022
Machine Learning Model Sizes and the Parameter Gap
Pablo Villalobos
J. Sevilla
T. Besiroglu
Lennart Heim
A. Ho
Marius Hobbhahn
ALM
ELM
AI4CE
30
58
0
05 Jul 2022
Metrics reloaded: Recommendations for image analysis validation
Lena Maier-Hein
Annika Reinke
Patrick Godau
M. Tizabi
Florian Buettner
...
Aleksei Tiulpin
Sotirios A. Tsaftaris
Ben Van Calster
Gaël Varoquaux
Paul F. Jäger
32
216
0
03 Jun 2022
What Do Compressed Multilingual Machine Translation Models Forget?
Alireza Mohammadshahi
Vassilina Nikoulina
Alexandre Berard
Caroline Brun
James Henderson
Laurent Besacier
AI4CE
42
9
0
22 May 2022
Single-Shot Optical Neural Network
Liane Bernstein
Alexander Sludds
C. Panuski
Sivan Trajtenberg‐Mills
R. Hamerly
Dirk Englund
BDL
29
44
0
18 May 2022
Adaptive Block Floating-Point for Analog Deep Learning Hardware
Ayon Basumallik
D. Bunandar
Nicholas Dronen
Nicholas Harris
Ludmila Levkova
Calvin McCarter
Lakshmi Nair
David Walter
David Widemann
11
6
0
12 May 2022
OPT: Open Pre-trained Transformer Language Models
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
...
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
VLM
OSLM
AI4CE
59
3,488
0
02 May 2022
Resource-efficient domain adaptive pre-training for medical images
Y. Mehmood
U. I. Bajwa
Xianfang Sun
14
1
0
28 Apr 2022
What Language Model Architecture and Pretraining Objective Work Best for Zero-Shot Generalization?
Thomas Wang
Adam Roberts
Daniel Hesslow
Teven Le Scao
Hyung Won Chung
Iz Beltagy
Julien Launay
Colin Raffel
31
167
0
12 Apr 2022
Parameter-efficient Model Adaptation for Vision Transformers
Xuehai He
Chunyuan Li
Pengchuan Zhang
Jianwei Yang
Qing Guo
30
84
0
29 Mar 2022
Automated Progressive Learning for Efficient Training of Vision Transformers
Changlin Li
Bohan Zhuang
Guangrun Wang
Xiaodan Liang
Xiaojun Chang
Yi Yang
28
46
0
28 Mar 2022
Token Dropping for Efficient BERT Pretraining
Le Hou
Richard Yuanzhe Pang
Dinesh Manocha
Yuexin Wu
Xinying Song
Xiaodan Song
Denny Zhou
22
42
0
24 Mar 2022
Hybrid training of optical neural networks
J. Spall
Xianxin Guo
A. Lvovsky
36
38
0
20 Mar 2022
What Did You Say? Task-Oriented Dialog Datasets Are Not Conversational!?
Alice Shoshana Jakobovits
Francesco Piccinno
Yasemin Altun
21
3
0
07 Mar 2022
Don't Be So Dense: Sparse-to-Sparse GAN Training Without Sacrificing Performance
Shiwei Liu
Yuesong Tian
Tianlong Chen
Li Shen
36
8
0
05 Mar 2022
Distilled Neural Networks for Efficient Learning to Rank
F. M. Nardini
Cosimo Rulli
Salvatore Trani
Rossano Venturini
FedML
29
16
0
22 Feb 2022
ST-MoE: Designing Stable and Transferable Sparse Expert Models
Barret Zoph
Irwan Bello
Sameer Kumar
Nan Du
Yanping Huang
J. Dean
Noam M. Shazeer
W. Fedus
MoE
24
181
0
17 Feb 2022
MERLOT Reserve: Neural Script Knowledge through Vision and Language and Sound
Rowan Zellers
Jiasen Lu
Ximing Lu
Youngjae Yu
Yanpeng Zhao
Mohammadreza Salehi
Aditya Kusupati
Jack Hessel
Ali Farhadi
Yejin Choi
31
207
0
07 Jan 2022
ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training for Language Understanding and Generation
Shuohuan Wang
Yu Sun
Yang Xiang
Zhihua Wu
Siyu Ding
...
Tian Wu
Wei Zeng
Ge Li
Wen Gao
Haifeng Wang
ELM
39
79
0
23 Dec 2021
High-Resolution Image Synthesis with Latent Diffusion Models
Robin Rombach
A. Blattmann
Dominik Lorenz
Patrick Esser
Bjorn Ommer
3DV
150
14,641
0
20 Dec 2021
Efficient Large Scale Language Modeling with Mixtures of Experts
Mikel Artetxe
Shruti Bhosale
Naman Goyal
Todor Mihaylov
Myle Ott
...
Jeff Wang
Luke Zettlemoyer
Mona T. Diab
Zornitsa Kozareva
Ves Stoyanov
MoE
61
188
0
20 Dec 2021