Deep Learning Scaling is Predictable, Empirically

1 December 2017
Joel Hestness
Sharan Narang
Newsha Ardalani
G. Diamos
Heewoo Jun
Hassan Kianinejad
Md. Mostofa Ali Patwary
Yang Yang
Yanqi Zhou

Papers citing "Deep Learning Scaling is Predictable, Empirically"

50 / 386 papers shown
Reproducible scaling laws for contrastive language-image learning
Mehdi Cherti
Romain Beaumont
Ross Wightman
Mitchell Wortsman
Gabriel Ilharco
Cade Gordon
Christoph Schuhmann
Ludwig Schmidt
J. Jitsev
VLM
CLIP
59
743
0
14 Dec 2022
Dist-PU: Positive-Unlabeled Learning from a Label Distribution Perspective
Yunrui Zhao
Qianqian Xu
Yangbangyan Jiang
Peisong Wen
Qingming Huang
30
38
0
06 Dec 2022
Languages You Know Influence Those You Learn: Impact of Language Characteristics on Multi-Lingual Text-to-Text Transfer
Benjamin Muller
Deepanshu Gupta
Siddharth Patwardhan
J. Fauconnier
David Vandyke
Sachin Agarwal
41
5
0
04 Dec 2022
Matching DNN Compression and Cooperative Training with Resources and Data Availability
F. Malandrino
G. Giacomo
Armin Karamzade
Marco Levorato
C. Chiasserini
45
9
0
02 Dec 2022
Understanding BLOOM: An empirical study on diverse NLP tasks
Parag Dakle
Sai Krishna Rallabandi
Preethi Raghavan
AI4CE
39
3
0
27 Nov 2022
A Survey of Learning Curves with Bad Behavior: or How More Data Need Not Lead to Better Performance
Marco Loog
T. Viering
26
1
0
25 Nov 2022
Power-law Scaling to Assist with Key Challenges in Artificial Intelligence
Yuval Meir
Shira Sardi
Shiri Hodassman
Karin Kisos
Itamar Ben-Noam
A. Goldental
Ido Kanter
30
16
0
15 Nov 2022
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
BigScience Workshop
Teven Le Scao
Angela Fan
Christopher Akiki
...
Zhongli Xie
Zifan Ye
M. Bras
Younes Belkada
Thomas Wolf
VLM
118
2,315
0
09 Nov 2022
eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers
Yogesh Balaji
Seungjun Nah
Xun Huang
Arash Vahdat
Jiaming Song
...
Timo Aila
S. Laine
Bryan Catanzaro
Tero Karras
Xuan Li
VLM
MoE
76
804
0
02 Nov 2022
A Solvable Model of Neural Scaling Laws
A. Maloney
Daniel A. Roberts
J. Sully
41
51
0
30 Oct 2022
Broken Neural Scaling Laws
Ethan Caballero
Kshitij Gupta
Irina Rish
David M. Krueger
30
74
0
26 Oct 2022
Scaling Laws Beyond Backpropagation
Matthew J. Filipovich
Alessandro Cappelli
Daniel Hesslow
Julien Launay
19
3
0
26 Oct 2022
'A net for everyone': fully personalized and unsupervised neural networks trained with longitudinal data from a single patient
Christian Strack
Kelsey L. Pomykala
H. Schlemmer
Jan Egger
Jens Kleesiek
20
3
0
25 Oct 2022
Precision Machine Learning
Eric J. Michaud
Ziming Liu
Max Tegmark
24
34
0
24 Oct 2022
Active Learning from the Web
Ryoma Sato
27
0
0
15 Oct 2022
Meta-Principled Family of Hyperparameter Scaling Strategies
Sho Yaida
58
16
0
10 Oct 2022
A Generalizable Artificial Intelligence Model for COVID-19 Classification Task Using Chest X-ray Radiographs: Evaluated Over Four Clinical Datasets with 15,097 Patients
Ran Zhang
Xin Tie
John W. Garrett
D. Griner
Z. Qi
N. Bevins
S. Reeder
Guangfeng Chen
OOD
34
2
0
04 Oct 2022
Optimizing Data Collection for Machine Learning
Rafid Mahmood
James Lucas
J. Álvarez
Sanja Fidler
M. Law
96
26
0
03 Oct 2022
Scaling Laws for a Multi-Agent Reinforcement Learning Model
Oren Neumann
C. Gros
32
26
0
29 Sep 2022
Scaling Laws For Deep Learning Based Image Reconstruction
Tobit Klug
Reinhard Heckel
65
12
0
27 Sep 2022
Local Grammar-Based Coding Revisited
L. Debowski
33
0
0
27 Sep 2022
Efficient Quantized Sparse Matrix Operations on Tensor Cores
Shigang Li
Kazuki Osawa
Torsten Hoefler
82
31
0
14 Sep 2022
Revisiting Neural Scaling Laws in Language and Vision
Ibrahim M. Alabdulmohsin
Behnam Neyshabur
Xiaohua Zhai
159
103
0
13 Sep 2022
Concept-Based Explanations for Tabular Data
Varsha Pendyala
Jihye Choi
FaML
XAI
FAtt
26
2
0
13 Sep 2022
Self-Supervised Pretraining for 2D Medical Image Segmentation
András Kalapos
Bálint Gyires-Tóth
17
31
0
01 Sep 2022
PercentMatch: Percentile-based Dynamic Thresholding for Multi-Label Semi-Supervised Classification
Jun Huang
Alexander Huang
Beatriz C. Guerra
Yen-Yun Yu
27
4
0
30 Aug 2022
Understanding Scaling Laws for Recommendation Models
Newsha Ardalani
Carole-Jean Wu
Zeliang Chen
Bhargav Bhushanam
Adnan Aziz
42
28
0
17 Aug 2022
What Can Be Learnt With Wide Convolutional Neural Networks?
Francesco Cagnetta
Alessandro Favero
M. Wyart
MLT
41
11
0
01 Aug 2022
The BUTTER Zone: An Empirical Study of Training Dynamics in Fully Connected Neural Networks
Charles Edison Tripp
J. Perr-Sauer
L. Hayne
M. Lunacek
Jamil Gafur
AI4CE
28
0
0
25 Jul 2022
How Much More Data Do I Need? Estimating Requirements for Downstream Tasks
Rafid Mahmood
James Lucas
David Acuna
Daiqing Li
Jonah Philion
Jose M. Alvarez
Zhiding Yu
Sanja Fidler
M. Law
19
27
0
04 Jul 2022
Beyond neural scaling laws: beating power law scaling via data pruning
Ben Sorscher
Robert Geirhos
Shashank Shekhar
Surya Ganguli
Ari S. Morcos
28
419
0
29 Jun 2022
Studying Generalization Through Data Averaging
C. Gomez-Uribe
FedML
27
0
0
28 Jun 2022
Learning sparse features can lead to overfitting in neural networks
Leonardo Petrini
Francesco Cagnetta
Eric Vanden-Eijnden
M. Wyart
MLT
42
23
0
24 Jun 2022
Supervised learning of random quantum circuits via scalable neural networks
S. Cantori
D. Vitali
S. Pilati
UQCV
11
7
0
21 Jun 2022
MACE: Higher Order Equivariant Message Passing Neural Networks for Fast and Accurate Force Fields
Ilyes Batatia
D. P. Kovács
G. Simm
Christoph Ortner
Gábor Csányi
47
442
0
15 Jun 2022
On Data Scaling in Masked Image Modeling
Zhenda Xie
Zheng-Wei Zhang
Yue Cao
Yutong Lin
Yixuan Wei
Qi Dai
Han Hu
34
52
0
09 Jun 2022
Evaluating the Impact of Model Scale for Compositional Generalization in Semantic Parsing
Linlu Qiu
Peter Shaw
Panupong Pasupat
Tianze Shi
Jonathan Herzig
Emily Pitler
Fei Sha
Kristina Toutanova
AI4CE
LRM
35
52
0
24 May 2022
Improving Short Text Classification With Augmented Data Using GPT-3
Salvador Balkus
Donghui Yan
36
33
0
23 May 2022
Investigating classification learning curves for automatically generated and labelled plant images
Michael A. Beck
C. Bidinosti
Christopher J. Henry
Manisha Ajmani
17
0
0
22 May 2022
Scaling Laws and Interpretability of Learning from Repeated Data
Danny Hernandez
Tom B. Brown
Tom Conerly
Nova Dassarma
Dawn Drain
...
Catherine Olsson
Dario Amodei
Nicholas Joseph
Jared Kaplan
Sam McCandlish
33
112
0
21 May 2022
Self-Programming Artificial Intelligence Using Code-Generating Language Models
Alex Sheng
Shankar Padmanabhan
SyDa
17
2
0
30 Apr 2022
Demonstration of Superconducting Optoelectronic Single-Photon Synapses
Saeed A. Khan
Bryce Primavera
J. Chiles
A. McCaughan
S. Buckley
...
A. Fox
D. Olaya
R. Mirin
S. Nam
J. Shainline
14
2
0
20 Apr 2022
Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks
Yizhong Wang
Swaroop Mishra
Pegah Alipoormolabashi
Yeganeh Kordi
Amirreza Mirzaei
...
Chitta Baral
Yejin Choi
Noah A. Smith
Hannaneh Hajishirzi
Daniel Khashabi
ELM
59
790
0
16 Apr 2022
GemNet-OC: Developing Graph Neural Networks for Large and Diverse Molecular Simulation Datasets
Johannes Gasteiger
Muhammed Shuaibi
Anuroop Sriram
Stephan Günnemann
Zachary W. Ulissi
C. L. Zitnick
Abhishek Das
AI4TS
MLAU
45
66
0
06 Apr 2022
Semi-Supervised Learning of Semantic Correspondence with Pseudo-Labels
Jiwon Kim
Kwang-seok Ryoo
Junyoung Seo
Gyuseong Lee
Daehwan Kim
Hansang Cho
Seung Wook Kim
16
23
0
30 Mar 2022
Time Dependency, Data Flow, and Competitive Advantage
E. Valavi
Joel Hestness
M. Iansiti
Newsha Ardalani
Feng Zhu
K. Lakhani
AI4TS
16
0
0
17 Mar 2022
Time and the Value of Data
E. Valavi
Joel Hestness
Newsha Ardalani
M. Iansiti
AIFin
32
20
0
17 Mar 2022
SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities
Hsiang-Sheng Tsai
Heng-Jui Chang
Wen-Chin Huang
Zili Huang
Kushal Lakhotia
...
Hsuan-Jui Chen
Shang-Wen Li
Shinji Watanabe
Abdel-rahman Mohamed
Hung-yi Lee
26
109
0
14 Mar 2022
Interpolation-based Contrastive Learning for Few-Label Semi-Supervised Learning
Xihong Yang
Xiaochang Hu
Sihang Zhou
Xinwang Liu
En Zhu
SSL
187
43
0
24 Feb 2022
Mixture-of-Experts with Expert Choice Routing
Yan-Quan Zhou
Tao Lei
Han-Chu Liu
Nan Du
Yanping Huang
Vincent Zhao
Andrew M. Dai
Zhifeng Chen
Quoc V. Le
James Laudon
MoE
160
329
0
18 Feb 2022