1712.00409
Cited By
Deep Learning Scaling is Predictable, Empirically
1 December 2017
Joel Hestness
Sharan Narang
Newsha Ardalani
G. Diamos
Heewoo Jun
Hassan Kianinejad
Md. Mostofa Ali Patwary
Yang Yang
Yanqi Zhou
Papers citing
"Deep Learning Scaling is Predictable, Empirically"
50 / 372 papers shown
Title
Self-Programming Artificial Intelligence Using Code-Generating Language Models
Alex Sheng
Shankar Padmanabhan
SyDa
36
2
0
30 Apr 2022
Demonstration of Superconducting Optoelectronic Single-Photon Synapses
Saeed A. Khan
Bryce Primavera
J. Chiles
A. McCaughan
S. Buckley
...
A. Fox
D. Olaya
R. Mirin
S. Nam
J. Shainline
60
2
0
20 Apr 2022
Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks
Yizhong Wang
Swaroop Mishra
Pegah Alipoormolabashi
Yeganeh Kordi
Amirreza Mirzaei
...
Chitta Baral
Yejin Choi
Noah A. Smith
Hannaneh Hajishirzi
Daniel Khashabi
ELM
137
864
0
16 Apr 2022
GemNet-OC: Developing Graph Neural Networks for Large and Diverse Molecular Simulation Datasets
Johannes Gasteiger
Muhammed Shuaibi
Anuroop Sriram
Stephan Günnemann
Zachary W. Ulissi
C. L. Zitnick
Abhishek Das
AI4TS
MLAU
123
70
0
06 Apr 2022
Semi-Supervised Learning of Semantic Correspondence with Pseudo-Labels
Jiwon Kim
Kwang-seok Ryoo
Junyoung Seo
Gyuseong Lee
Daehwan Kim
Hansang Cho
Seung Wook Kim
90
26
0
30 Mar 2022
Time Dependency, Data Flow, and Competitive Advantage
E. Valavi
Joel Hestness
M. Iansiti
Newsha Ardalani
Feng Zhu
K. Lakhani
AI4TS
38
1
0
17 Mar 2022
Time and the Value of Data
E. Valavi
Joel Hestness
Newsha Ardalani
M. Iansiti
AIFin
61
20
0
17 Mar 2022
SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities
Hsiang-Sheng Tsai
Heng-Jui Chang
Wen-Chin Huang
Zili Huang
Kushal Lakhotia
...
Hsuan-Jui Chen
Shang-Wen Li
Shinji Watanabe
Abdel-rahman Mohamed
Hung-yi Lee
93
110
0
14 Mar 2022
Interpolation-based Contrastive Learning for Few-Label Semi-Supervised Learning
Xihong Yang
Xiaochang Hu
Sihang Zhou
Xinwang Liu
En Zhu
SSL
329
44
0
24 Feb 2022
Mixture-of-Experts with Expert Choice Routing
Yanqi Zhou
Tao Lei
Hanxiao Liu
Nan Du
Yanping Huang
Vincent Zhao
Andrew M. Dai
Zhifeng Chen
Quoc V. Le
James Laudon
MoE
321
377
0
18 Feb 2022
Human-Algorithm Collaboration: Achieving Complementarity and Avoiding Unfairness
Kate Donahue
Alexandra Chouldechova
K. Kenthapadi
FaML
FedML
112
54
0
17 Feb 2022
Compute Trends Across Three Eras of Machine Learning
J. Sevilla
Lennart Heim
A. Ho
T. Besiroglu
Marius Hobbhahn
Pablo Villalobos
116
279
0
11 Feb 2022
Failure and success of the spectral bias prediction for Kernel Ridge Regression: the case of low-dimensional data
Umberto M. Tomasini
Antonio Sclocchi
Matthieu Wyart
74
12
0
07 Feb 2022
DASHA: Distributed Nonconvex Optimization with Communication Compression, Optimal Oracle Complexity, and No Client Synchronization
Alexander Tyurin
Peter Richtárik
119
19
0
02 Feb 2022
Reducing the Amount of Real World Data for Object Detector Training with Synthetic Data
Sven Burdorf
Karoline Plum
Daniel Hasenklever
35
4
0
31 Jan 2022
Error Scaling Laws for Kernel Classification under Source and Capacity Conditions
Hugo Cui
Bruno Loureiro
Florent Krzakala
Lenka Zdeborová
111
11
0
29 Jan 2022
A Transferable Approach for Partitioning Machine Learning Models on Multi-Chip-Modules
Xinfeng Xie
Prakash Prabhu
Ulysse Beaugnon
P. Phothilimthana
Sudip Roy
Azalia Mirhoseini
E. Brevdo
James Laudon
Yanqi Zhou
51
5
0
07 Dec 2021
Models of fairness in federated learning
Kate Donahue
Jon M. Kleinberg
FedML
108
11
0
01 Dec 2021
AugLiChem: Data Augmentation Library of Chemical Structures for Machine Learning
Rishikesh Magar
Yuyang Wang
Cooper Lorsung
Chen Liang
Hariharan Ramasubramanian
Peiyuan Li
A. Farimani
98
33
0
30 Nov 2021
Turing-Universal Learners with Optimal Scaling Laws
Preetum Nakkiran
78
2
0
09 Nov 2021
Learning curves for Gaussian process regression with power-law priors and targets
Hui Jin
P. Banerjee
Guido Montúfar
72
18
0
23 Oct 2021
The Dangers of Underclaiming: Reasons for Caution When Reporting How NLP Systems Fail
Sam Bowman
OffRL
117
45
0
15 Oct 2021
Training Deep Neural Networks with Joint Quantization and Pruning of Weights and Activations
Xinyu Zhang
Ian Colbert
Ken Kreutz-Delgado
Srinjoy Das
MQ
100
12
0
15 Oct 2021
Scaling Laws for the Few-Shot Adaptation of Pre-trained Image Classifiers
Gabriele Prato
Simon Guiroy
Ethan Caballero
Irina Rish
Sarath Chandar
VLM
94
12
0
13 Oct 2021
RankingMatch: Delving into Semi-Supervised Learning with Consistency Regularization and Ranking Loss
Trung Q. Tran
Mingu Kang
Daeyoung Kim
35
2
0
09 Oct 2021
Unsupervised Selective Labeling for More Effective Semi-Supervised Learning
Xudong Wang
Long Lian
Stella X. Yu
265
32
0
06 Oct 2021
Max and Coincidence Neurons in Neural Networks
Albert Lee
Kang L. Wang
16
1
0
04 Oct 2021
Unsolved Problems in ML Safety
Dan Hendrycks
Nicholas Carlini
John Schulman
Jacob Steinhardt
289
294
0
28 Sep 2021
Scaling Laws for Neural Machine Translation
Behrooz Ghorbani
Orhan Firat
Markus Freitag
Ankur Bapna
M. Krikun
Xavier Garcia
Ciprian Chelba
Colin Cherry
90
103
0
16 Sep 2021
Formalizing and Estimating Distribution Inference Risks
Anshuman Suri
David Evans
MIACV
110
52
0
13 Sep 2021
Compute and Energy Consumption Trends in Deep Learning Inference
Radosvet Desislavov
Fernando Martínez-Plumed
José Hernández-Orallo
77
119
0
12 Sep 2021
Why and How Governments Should Monitor AI Development
Jess Whittlestone
Jack Clark
47
31
0
28 Aug 2021
A Scaling Law for Synthetic-to-Real Transfer: How Much Is Your Pre-training Effective?
Hiroaki Mikami
Kenji Fukumizu
Shogo Murai
Shuji Suzuki
Yuta Kikuchi
Taiji Suzuki
S. Maeda
Kohei Hayashi
92
12
0
25 Aug 2021
Scalable Bayesian transport maps for high-dimensional non-Gaussian spatial fields
Matthias Katzfuss
Florian Schäfer
OT
118
14
0
09 Aug 2021
On The State of Data In Computer Vision: Human Annotations Remain Indispensable for Developing Deep Learning Models
Z. Emam
Andrew Kondrich
Sasha Harrison
Felix Lau
Yushi Wang
Aerin Kim
E. Branson
VLM
50
13
0
31 Jul 2021
Dataset Distillation with Infinitely Wide Convolutional Networks
Timothy Nguyen
Roman Novak
Lechao Xiao
Jaehoon Lee
DD
109
237
0
27 Jul 2021
Learning to Limit Data Collection via Scaling Laws: A Computational Interpretation for the Legal Principle of Data Minimization
Divya Shanmugam
Samira Shabanian
Fernando Diaz
Michèle Finck
Joanna Biega
75
22
0
16 Jul 2021
A Dual-Purpose Deep Learning Model for Auscultated Lung and Tracheal Sound Analysis Based on Mixed Set Training
Fu-Shun Hsu
Shang-Ran Huang
Chang-Fu Su
Chien-Wen Huang
Yuan-Ren Cheng
...
Nian-Jhen Lin
Wan-Ling Tsai
Ching-Shiang Lu
Chuan Chen
F. Lai
57
5
0
09 Jul 2021
Structured Model Pruning of Convolutional Networks on Tensor Processing Units
Kongtao Chen
Ken Franko
Ruoxin Sang
CVBM
38
59
0
09 Jul 2021
Meta-learning Amidst Heterogeneity and Ambiguity
Kyeongryeol Go
Seyoung Yun
91
1
0
05 Jul 2021
Hi-BEHRT: Hierarchical Transformer-based model for accurate prediction of clinical events using multimodal longitudinal electronic health records
Yikuan Li
M. Mamouei
G. Salimi-Khorshidi
Shishir Rao
A. Hassaine
D. Canoy
Thomas Lukasiewicz
K. Rahimi
96
82
0
21 Jun 2021
Locality defeats the curse of dimensionality in convolutional teacher-student scenarios
Alessandro Favero
Francesco Cagnetta
Matthieu Wyart
102
31
0
16 Jun 2021
Accelerating Sparse Deep Neural Networks
Asit K. Mishra
J. Latorre
Jeff Pool
Darko Stosic
Dusan Stosic
Ganesh Venkatesh
Chong Yu
Paulius Micikevicius
167
237
0
16 Apr 2021
Scaling Scaling Laws with Board Games
Andrew Jones
62
43
0
07 Apr 2021
Vulnerability Due to Training Order in Split Learning
Harshit Madaan
M. Gawali
V. Kulkarni
Aniruddha Pant
FedML
74
6
0
26 Mar 2021
UNICORN on RAINBOW: A Universal Commonsense Reasoning Model on a New Multitask Benchmark
Nicholas Lourie
Ronan Le Bras
Chandra Bhagavatula
Yejin Choi
LRM
108
140
0
24 Mar 2021
The Shape of Learning Curves: a Review
T. Viering
Marco Loog
80
135
0
19 Mar 2021
The Low-Rank Simplicity Bias in Deep Networks
Minyoung Huh
H. Mobahi
Richard Y. Zhang
Brian Cheung
Pulkit Agrawal
Phillip Isola
117
116
0
18 Mar 2021
Is it enough to optimize CNN architectures on ImageNet?
Lukas Tuggener
Jürgen Schmidhuber
Thilo Stadelmann
86
23
0
16 Mar 2021
Revisiting ResNets: Improved Training and Scaling Strategies
Irwan Bello
W. Fedus
Xianzhi Du
E. D. Cubuk
A. Srinivas
Nayeon Lee
Jonathon Shlens
Barret Zoph
98
302
0
13 Mar 2021