ResearchTrend.AI
Deep Learning Scaling is Predictable, Empirically


1 December 2017
Joel Hestness
Sharan Narang
Newsha Ardalani
G. Diamos
Heewoo Jun
Hassan Kianinejad
Md. Mostofa Ali Patwary
Yang Yang
Yanqi Zhou
arXiv:1712.00409

Papers citing "Deep Learning Scaling is Predictable, Empirically"

50 / 372 papers shown
Title
Self-Programming Artificial Intelligence Using Code-Generating Language Models
Alex Sheng
Shankar Padmanabhan
SyDa
36
2
0
30 Apr 2022
Demonstration of Superconducting Optoelectronic Single-Photon Synapses
Saeed A. Khan
Bryce Primavera
J. Chiles
A. McCaughan
S. Buckley
...
A. Fox
D. Olaya
R. Mirin
S. Nam
J. Shainline
60
2
0
20 Apr 2022
Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks
Yizhong Wang
Swaroop Mishra
Pegah Alipoormolabashi
Yeganeh Kordi
Amirreza Mirzaei
...
Chitta Baral
Yejin Choi
Noah A. Smith
Hannaneh Hajishirzi
Daniel Khashabi
ELM
137
864
0
16 Apr 2022
GemNet-OC: Developing Graph Neural Networks for Large and Diverse Molecular Simulation Datasets
Johannes Gasteiger
Muhammed Shuaibi
Anuroop Sriram
Stephan Günnemann
Zachary W. Ulissi
C. L. Zitnick
Abhishek Das
AI4TS, MLAU
123
70
0
06 Apr 2022
Semi-Supervised Learning of Semantic Correspondence with Pseudo-Labels
Jiwon Kim
Kwang-seok Ryoo
Junyoung Seo
Gyuseong Lee
Daehwan Kim
Hansang Cho
Seung Wook Kim
90
26
0
30 Mar 2022
Time Dependency, Data Flow, and Competitive Advantage
E. Valavi
Joel Hestness
M. Iansiti
Newsha Ardalani
Feng Zhu
K. Lakhani
AI4TS
38
1
0
17 Mar 2022
Time and the Value of Data
E. Valavi
Joel Hestness
Newsha Ardalani
M. Iansiti
AIFin
61
20
0
17 Mar 2022
SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities
Hsiang-Sheng Tsai
Heng-Jui Chang
Wen-Chin Huang
Zili Huang
Kushal Lakhotia
...
Hsuan-Jui Chen
Shang-Wen Li
Shinji Watanabe
Abdel-rahman Mohamed
Hung-yi Lee
93
110
0
14 Mar 2022
Interpolation-based Contrastive Learning for Few-Label Semi-Supervised Learning
Xihong Yang
Xiaochang Hu
Sihang Zhou
Xinwang Liu
En Zhu
SSL
329
44
0
24 Feb 2022
Mixture-of-Experts with Expert Choice Routing
Yan-Quan Zhou
Tao Lei
Han-Chu Liu
Nan Du
Yanping Huang
Vincent Zhao
Andrew M. Dai
Zhifeng Chen
Quoc V. Le
James Laudon
MoE
321
377
0
18 Feb 2022
Human-Algorithm Collaboration: Achieving Complementarity and Avoiding Unfairness
Kate Donahue
Alexandra Chouldechova
K. Kenthapadi
FaML, FedML
112
54
0
17 Feb 2022
Compute Trends Across Three Eras of Machine Learning
J. Sevilla
Lennart Heim
A. Ho
T. Besiroglu
Marius Hobbhahn
Pablo Villalobos
116
279
0
11 Feb 2022
Failure and success of the spectral bias prediction for Kernel Ridge Regression: the case of low-dimensional data
Umberto M. Tomasini
Antonio Sclocchi
Matthieu Wyart
74
12
0
07 Feb 2022
DASHA: Distributed Nonconvex Optimization with Communication Compression, Optimal Oracle Complexity, and No Client Synchronization
Alexander Tyurin
Peter Richtárik
119
19
0
02 Feb 2022
Reducing the Amount of Real World Data for Object Detector Training with Synthetic Data
Sven Burdorf
Karoline Plum
Daniel Hasenklever
35
4
0
31 Jan 2022
Error Scaling Laws for Kernel Classification under Source and Capacity Conditions
Hugo Cui
Bruno Loureiro
Florent Krzakala
Lenka Zdeborová
111
11
0
29 Jan 2022
A Transferable Approach for Partitioning Machine Learning Models on Multi-Chip-Modules
Xinfeng Xie
Prakash Prabhu
Ulysse Beaugnon
P. Phothilimthana
Sudip Roy
Azalia Mirhoseini
E. Brevdo
James Laudon
Yanqi Zhou
51
5
0
07 Dec 2021
Models of fairness in federated learning
Kate Donahue
Jon M. Kleinberg
FedML
108
11
0
01 Dec 2021
AugLiChem: Data Augmentation Library of Chemical Structures for Machine Learning
Rishikesh Magar
Yuyang Wang
Cooper Lorsung
Chen Liang
Hariharan Ramasubramanian
Peiyuan Li
A. Farimani
98
33
0
30 Nov 2021
Turing-Universal Learners with Optimal Scaling Laws
Preetum Nakkiran
78
2
0
09 Nov 2021
Learning curves for Gaussian process regression with power-law priors and targets
Hui Jin
P. Banerjee
Guido Montúfar
72
18
0
23 Oct 2021
The Dangers of Underclaiming: Reasons for Caution When Reporting How NLP Systems Fail
Sam Bowman
OffRL
117
45
0
15 Oct 2021
Training Deep Neural Networks with Joint Quantization and Pruning of Weights and Activations
Xinyu Zhang
Ian Colbert
Ken Kreutz-Delgado
Srinjoy Das
MQ
100
12
0
15 Oct 2021
Scaling Laws for the Few-Shot Adaptation of Pre-trained Image Classifiers
Gabriele Prato
Simon Guiroy
Ethan Caballero
Irina Rish
Sarath Chandar
VLM
94
12
0
13 Oct 2021
RankingMatch: Delving into Semi-Supervised Learning with Consistency Regularization and Ranking Loss
Trung Q. Tran
Mingu Kang
Daeyoung Kim
35
2
0
09 Oct 2021
Unsupervised Selective Labeling for More Effective Semi-Supervised Learning
Xudong Wang
Long Lian
Stella X. Yu
265
32
0
06 Oct 2021
Max and Coincidence Neurons in Neural Networks
Albert Lee
Kang L. Wang
16
1
0
04 Oct 2021
Unsolved Problems in ML Safety
Dan Hendrycks
Nicholas Carlini
John Schulman
Jacob Steinhardt
289
294
0
28 Sep 2021
Scaling Laws for Neural Machine Translation
Behrooz Ghorbani
Orhan Firat
Markus Freitag
Ankur Bapna
M. Krikun
Xavier Garcia
Ciprian Chelba
Colin Cherry
90
103
0
16 Sep 2021
Formalizing and Estimating Distribution Inference Risks
Anshuman Suri
David Evans
MIACV
110
52
0
13 Sep 2021
Compute and Energy Consumption Trends in Deep Learning Inference
Radosvet Desislavov
Fernando Martínez-Plumed
José Hernández-Orallo
77
119
0
12 Sep 2021
Why and How Governments Should Monitor AI Development
Jess Whittlestone
Jack Clark
47
31
0
28 Aug 2021
A Scaling Law for Synthetic-to-Real Transfer: How Much Is Your Pre-training Effective?
Hiroaki Mikami
Kenji Fukumizu
Shogo Murai
Shuji Suzuki
Yuta Kikuchi
Taiji Suzuki
S. Maeda
Kohei Hayashi
92
12
0
25 Aug 2021
Scalable Bayesian transport maps for high-dimensional non-Gaussian spatial fields
Matthias Katzfuss
Florian Schafer
OT
118
14
0
09 Aug 2021
On The State of Data In Computer Vision: Human Annotations Remain Indispensable for Developing Deep Learning Models
Z. Emam
Andrew Kondrich
Sasha Harrison
Felix Lau
Yushi Wang
Aerin Kim
E. Branson
VLM
50
13
0
31 Jul 2021
Dataset Distillation with Infinitely Wide Convolutional Networks
Timothy Nguyen
Roman Novak
Lechao Xiao
Jaehoon Lee
DD
109
237
0
27 Jul 2021
Learning to Limit Data Collection via Scaling Laws: A Computational Interpretation for the Legal Principle of Data Minimization
Divya Shanmugam
Samira Shabanian
Fernando Diaz
Michèle Finck
Joanna Biega
75
22
0
16 Jul 2021
A Dual-Purpose Deep Learning Model for Auscultated Lung and Tracheal Sound Analysis Based on Mixed Set Training
Fu-Shun Hsu
Shang-Ran Huang
Chang-Fu Su
Chien-Wen Huang
Yuan-Ren Cheng
...
Nian-Jhen Lin
Wan-Ling Tsai
Ching-Shiang Lu
Chuan Chen
F. Lai
57
5
0
09 Jul 2021
Structured Model Pruning of Convolutional Networks on Tensor Processing Units
Kongtao Chen
Ken Franko
Ruoxin Sang
CVBM
38
59
0
09 Jul 2021
Meta-learning Amidst Heterogeneity and Ambiguity
Kyeongryeol Go
Seyoung Yun
91
1
0
05 Jul 2021
Hi-BEHRT: Hierarchical Transformer-based model for accurate prediction of clinical events using multimodal longitudinal electronic health records
Yikuan Li
M. Mamouei
G. Salimi-Khorshidi
Shishir Rao
A. Hassaine
D. Canoy
Thomas Lukasiewicz
K. Rahimi
96
82
0
21 Jun 2021
Locality defeats the curse of dimensionality in convolutional teacher-student scenarios
Alessandro Favero
Francesco Cagnetta
Matthieu Wyart
102
31
0
16 Jun 2021
Accelerating Sparse Deep Neural Networks
Asit K. Mishra
J. Latorre
Jeff Pool
Darko Stosic
Dusan Stosic
Ganesh Venkatesh
Chong Yu
Paulius Micikevicius
167
237
0
16 Apr 2021
Scaling Scaling Laws with Board Games
Andrew Jones
62
43
0
07 Apr 2021
Vulnerability Due to Training Order in Split Learning
Harshit Madaan
M. Gawali
V. Kulkarni
Aniruddha Pant
FedML
74
6
0
26 Mar 2021
UNICORN on RAINBOW: A Universal Commonsense Reasoning Model on a New Multitask Benchmark
Nicholas Lourie
Ronan Le Bras
Chandra Bhagavatula
Yejin Choi
LRM
108
140
0
24 Mar 2021
The Shape of Learning Curves: a Review
T. Viering
Marco Loog
80
135
0
19 Mar 2021
The Low-Rank Simplicity Bias in Deep Networks
Minyoung Huh
H. Mobahi
Richard Y. Zhang
Brian Cheung
Pulkit Agrawal
Phillip Isola
117
116
0
18 Mar 2021
Is it enough to optimize CNN architectures on ImageNet?
Lukas Tuggener
Jürgen Schmidhuber
Thilo Stadelmann
86
23
0
16 Mar 2021
Revisiting ResNets: Improved Training and Scaling Strategies
Irwan Bello
W. Fedus
Xianzhi Du
E. D. Cubuk
A. Srinivas
Nayeon Lee
Jonathon Shlens
Barret Zoph
98
302
0
13 Mar 2021