1712.00409
Cited By
Deep Learning Scaling is Predictable, Empirically
1 December 2017
Joel Hestness
Sharan Narang
Newsha Ardalani
G. Diamos
Heewoo Jun
Hassan Kianinejad
Md. Mostofa Ali Patwary
Yang Yang
Yanqi Zhou
Papers citing
"Deep Learning Scaling is Predictable, Empirically"
50 / 372 papers shown
Title
LLaMA: Open and Efficient Foundation Language Models
Hugo Touvron
Thibaut Lavril
Gautier Izacard
Xavier Martinet
Marie-Anne Lachaux
...
Faisal Azhar
Aurelien Rodriguez
Armand Joulin
Edouard Grave
Guillaume Lample
ALM
PILM
1.7K
13,554
0
27 Feb 2023
The Dormant Neuron Phenomenon in Deep Reinforcement Learning
Ghada Sokar
Rishabh Agarwal
Pablo Samuel Castro
Utku Evci
CLL
110
100
0
24 Feb 2023
Scaling Laws for Multilingual Neural Machine Translation
Patrick Fernandes
Behrooz Ghorbani
Xavier Garcia
Markus Freitag
Orhan Firat
109
30
0
19 Feb 2023
Cliff-Learning
T. T. Wang
I. Zablotchi
Nir Shavit
Jonathan S. Rosenfeld
66
0
0
14 Feb 2023
Data pruning and neural scaling laws: fundamental limitations of score-based algorithms
Fadhel Ayed
Soufiane Hayou
111
10
0
14 Feb 2023
Ten Lessons We Have Learned in the New "Sparseland": A Short Handbook for Sparse Neural Network Researchers
Shiwei Liu
Zhangyang Wang
127
32
0
06 Feb 2023
Scaling Laws for Hyperparameter Optimization
Arlind Kadra
Maciej Janowski
Martin Wistuba
Josif Grabocka
98
10
0
01 Feb 2023
A Closer Look at Few-shot Classification Again
Xu Luo
Hao Wu
Ji Zhang
Lianli Gao
Jing Xu
Jingkuan Song
94
53
0
28 Jan 2023
Scaling Laws for Generative Mixed-Modal Language Models
Armen Aghajanyan
L. Yu
Alexis Conneau
Wei-Ning Hsu
Karen Hambardzumyan
Susan Zhang
Stephen Roller
Naman Goyal
Omer Levy
Luke Zettlemoyer
MoE
VLM
100
110
0
10 Jan 2023
The case for 4-bit precision: k-bit Inference Scaling Laws
Tim Dettmers
Luke Zettlemoyer
MQ
112
234
0
19 Dec 2022
Reproducible scaling laws for contrastive language-image learning
Mehdi Cherti
Romain Beaumont
Ross Wightman
Mitchell Wortsman
Gabriel Ilharco
Cade Gordon
Christoph Schuhmann
Ludwig Schmidt
J. Jitsev
VLM
CLIP
141
824
0
14 Dec 2022
Dist-PU: Positive-Unlabeled Learning from a Label Distribution Perspective
Yunrui Zhao
Qianqian Xu
Yangbangyan Jiang
Peisong Wen
Qingming Huang
71
40
0
06 Dec 2022
Languages You Know Influence Those You Learn: Impact of Language Characteristics on Multi-Lingual Text-to-Text Transfer
Benjamin Muller
Deepanshu Gupta
Siddharth Patwardhan
J. Fauconnier
David Vandyke
Sachin Agarwal
94
5
0
04 Dec 2022
Matching DNN Compression and Cooperative Training with Resources and Data Availability
F. Malandrino
G. Giacomo
Armin Karamzade
Marco Levorato
C. Chiasserini
108
9
0
02 Dec 2022
Understanding BLOOM: An empirical study on diverse NLP tasks
Parag Dakle
Sai Krishna Rallabandi
Preethi Raghavan
AI4CE
91
4
0
27 Nov 2022
A Survey of Learning Curves with Bad Behavior: or How More Data Need Not Lead to Better Performance
Marco Loog
T. Viering
74
1
0
25 Nov 2022
Power-law Scaling to Assist with Key Challenges in Artificial Intelligence
Yuval Meir
Shira Sardi
Shiri Hodassman
Karin Kisos
Itamar Ben-Noam
A. Goldental
Ido Kanter
101
16
0
15 Nov 2022
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
BigScience Workshop
Teven Le Scao
Angela Fan
Christopher Akiki
...
Zhongli Xie
Zifan Ye
M. Bras
Younes Belkada
Thomas Wolf
VLM
486
2,401
0
09 Nov 2022
eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers
Yogesh Balaji
Seungjun Nah
Xun Huang
Arash Vahdat
Jiaming Song
...
Timo Aila
S. Laine
Bryan Catanzaro
Tero Karras
Xuan Li
VLM
MoE
219
832
0
02 Nov 2022
A Solvable Model of Neural Scaling Laws
A. Maloney
Daniel A. Roberts
J. Sully
126
57
0
30 Oct 2022
Broken Neural Scaling Laws
Ethan Caballero
Kshitij Gupta
Irina Rish
David M. Krueger
151
76
0
26 Oct 2022
Scaling Laws Beyond Backpropagation
Matthew J. Filipovich
Alessandro Cappelli
Daniel Hesslow
Julien Launay
55
3
0
26 Oct 2022
'A net for everyone': fully personalized and unsupervised neural networks trained with longitudinal data from a single patient
Christian Strack
Kelsey L. Pomykala
H. Schlemmer
Jan Egger
Jens Kleesiek
52
3
0
25 Oct 2022
Precision Machine Learning
Eric J. Michaud
Ziming Liu
Max Tegmark
89
37
0
24 Oct 2022
Active Learning from the Web
Ryoma Sato
40
0
0
15 Oct 2022
Meta-Principled Family of Hyperparameter Scaling Strategies
Sho Yaida
111
16
0
10 Oct 2022
A Generalizable Artificial Intelligence Model for COVID-19 Classification Task Using Chest X-ray Radiographs: Evaluated Over Four Clinical Datasets with 15,097 Patients
Ran Zhang
Xin Tie
John W. Garrett
D. Griner
Z. Qi
N. Bevins
S. Reeder
Guangfeng Chen
OOD
70
2
0
04 Oct 2022
Optimizing Data Collection for Machine Learning
Rafid Mahmood
James Lucas
J. Álvarez
Sanja Fidler
M. Law
165
28
0
03 Oct 2022
Scaling Laws for a Multi-Agent Reinforcement Learning Model
Oren Neumann
C. Gros
92
27
0
29 Sep 2022
Scaling Laws For Deep Learning Based Image Reconstruction
Tobit Klug
Reinhard Heckel
122
13
0
27 Sep 2022
Local Grammar-Based Coding Revisited
L. Debowski
83
0
0
27 Sep 2022
Efficient Quantized Sparse Matrix Operations on Tensor Cores
Shigang Li
Kazuki Osawa
Torsten Hoefler
160
32
0
14 Sep 2022
Revisiting Neural Scaling Laws in Language and Vision
Ibrahim Alabdulmohsin
Behnam Neyshabur
Xiaohua Zhai
235
111
0
13 Sep 2022
Concept-Based Explanations for Tabular Data
Varsha Pendyala
Jihye Choi
FaML
XAI
FAtt
74
3
0
13 Sep 2022
Self-Supervised Pretraining for 2D Medical Image Segmentation
András Kalapos
Bálint Gyires-Tóth
104
30
0
01 Sep 2022
PercentMatch: Percentile-based Dynamic Thresholding for Multi-Label Semi-Supervised Classification
Jun Huang
Alexander Huang
Beatriz C. Guerra
Yen-Yun Yu
61
5
0
30 Aug 2022
Understanding Scaling Laws for Recommendation Models
Newsha Ardalani
Carole-Jean Wu
Zeliang Chen
Bhargav Bhushanam
Adnan Aziz
93
31
0
17 Aug 2022
What Can Be Learnt With Wide Convolutional Neural Networks?
Francesco Cagnetta
Alessandro Favero
Matthieu Wyart
MLT
149
12
0
01 Aug 2022
The BUTTER Zone: An Empirical Study of Training Dynamics in Fully Connected Neural Networks
Charles Edison Tripp
J. Perr-Sauer
L. Hayne
M. Lunacek
Jamil Gafur
AI4CE
97
1
0
25 Jul 2022
How Much More Data Do I Need? Estimating Requirements for Downstream Tasks
Rafid Mahmood
James Lucas
David Acuna
Daiqing Li
Jonah Philion
Jose M. Alvarez
Zhiding Yu
Sanja Fidler
M. Law
61
28
0
04 Jul 2022
Beyond neural scaling laws: beating power law scaling via data pruning
Ben Sorscher
Robert Geirhos
Shashank Shekhar
Surya Ganguli
Ari S. Morcos
130
448
0
29 Jun 2022
Studying Generalization Through Data Averaging
C. Gomez-Uribe
FedML
138
0
0
28 Jun 2022
Learning sparse features can lead to overfitting in neural networks
Leonardo Petrini
Francesco Cagnetta
Eric Vanden-Eijnden
Matthieu Wyart
MLT
103
26
0
24 Jun 2022
Supervised learning of random quantum circuits via scalable neural networks
S. Cantori
D. Vitali
S. Pilati
UQCV
20
8
0
21 Jun 2022
MACE: Higher Order Equivariant Message Passing Neural Networks for Fast and Accurate Force Fields
Ilyes Batatia
D. P. Kovács
G. Simm
Christoph Ortner
Gábor Csányi
97
506
0
15 Jun 2022
On Data Scaling in Masked Image Modeling
Zhenda Xie
Zheng Zhang
Yue Cao
Yutong Lin
Yixuan Wei
Qi Dai
Han Hu
100
57
0
09 Jun 2022
Evaluating the Impact of Model Scale for Compositional Generalization in Semantic Parsing
Linlu Qiu
Peter Shaw
Panupong Pasupat
Tianze Shi
Jonathan Herzig
Emily Pitler
Fei Sha
Kristina Toutanova
AI4CE
LRM
162
54
0
24 May 2022
Improving Short Text Classification With Augmented Data Using GPT-3
Salvador Balkus
Donghui Yan
61
37
0
23 May 2022
Investigating classification learning curves for automatically generated and labelled plant images
Michael A. Beck
C. Bidinosti
Christopher J. Henry
Manisha Ajmani
28
0
0
22 May 2022
Scaling Laws and Interpretability of Learning from Repeated Data
Danny Hernandez
Tom B. Brown
Tom Conerly
Nova Dassarma
Dawn Drain
...
Catherine Olsson
Dario Amodei
Nicholas Joseph
Jared Kaplan
Sam McCandlish
90
118
0
21 May 2022