Deep Learning Scaling is Predictable, Empirically

1 December 2017

Sharan Narang

Md. Mostofa Ali Patwary

Yang Yang

Yanqi Zhou


Papers citing "Deep Learning Scaling is Predictable, Empirically"

50 / 372 papers shown

Self-Programming Artificial Intelligence Using Code-Generating Language Models Alex Sheng Shankar Padmanabhan SyDa 36 2 0 30 Apr 2022
Demonstration of Superconducting Optoelectronic Single-Photon Synapses Saeed A. Khan Bryce Primavera J. Chiles A. McCaughan S. Buckley ... A. Fox D. Olaya R. Mirin S. Nam J. Shainline 60 2 0 20 Apr 2022
Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks Yizhong Wang Swaroop Mishra Pegah Alipoormolabashi Yeganeh Kordi Amirreza Mirzaei ... Chitta Baral Yejin Choi Noah A. Smith Hannaneh Hajishirzi Daniel Khashabi ELM 137 864 0 16 Apr 2022
GemNet-OC: Developing Graph Neural Networks for Large and Diverse Molecular Simulation Datasets Johannes Gasteiger Muhammed Shuaibi Anuroop Sriram Stephan Günnemann Zachary W. Ulissi C. L. Zitnick Abhishek Das AI4TS MLAU 123 70 0 06 Apr 2022
Semi-Supervised Learning of Semantic Correspondence with Pseudo-Labels Jiwon Kim Kwang-seok Ryoo Junyoung Seo Gyuseong Lee Daehwan Kim Hansang Cho Seung Wook Kim 90 26 0 30 Mar 2022
Time Dependency, Data Flow, and Competitive Advantage E. Valavi Joel Hestness M. Iansiti Newsha Ardalani Feng Zhu K. Lakhani AI4TS 38 1 0 17 Mar 2022
Time and the Value of Data E. Valavi Joel Hestness Newsha Ardalani M. Iansiti AIFin 61 20 0 17 Mar 2022
SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities Hsiang-Sheng Tsai Heng-Jui Chang Wen-Chin Huang Zili Huang Kushal Lakhotia ... Hsuan-Jui Chen Shang-Wen Li Shinji Watanabe Abdel-rahman Mohamed Hung-yi Lee 93 110 0 14 Mar 2022
Interpolation-based Contrastive Learning for Few-Label Semi-Supervised Learning Xihong Yang Xiaochang Hu Sihang Zhou Xinwang Liu En Zhu SSL 329 44 0 24 Feb 2022
Mixture-of-Experts with Expert Choice Routing Yanqi Zhou Tao Lei Hanxiao Liu Nan Du Yanping Huang Vincent Zhao Andrew M. Dai Zhifeng Chen Quoc V. Le James Laudon MoE 321 377 0 18 Feb 2022
Human-Algorithm Collaboration: Achieving Complementarity and Avoiding Unfairness Kate Donahue Alexandra Chouldechova K. Kenthapadi FaML FedML 112 54 0 17 Feb 2022
Compute Trends Across Three Eras of Machine Learning J. Sevilla Lennart Heim A. Ho T. Besiroglu Marius Hobbhahn Pablo Villalobos 116 279 0 11 Feb 2022
Failure and success of the spectral bias prediction for Kernel Ridge Regression: the case of low-dimensional data Umberto M. Tomasini Antonio Sclocchi Matthieu Wyart 74 12 0 07 Feb 2022
DASHA: Distributed Nonconvex Optimization with Communication Compression, Optimal Oracle Complexity, and No Client Synchronization Alexander Tyurin Peter Richtárik 119 19 0 02 Feb 2022
Reducing the Amount of Real World Data for Object Detector Training with Synthetic Data Sven Burdorf Karoline Plum Daniel Hasenklever 35 4 0 31 Jan 2022
Error Scaling Laws for Kernel Classification under Source and Capacity Conditions Hugo Cui Bruno Loureiro Florent Krzakala Lenka Zdeborová 111 11 0 29 Jan 2022
A Transferable Approach for Partitioning Machine Learning Models on Multi-Chip-Modules Xinfeng Xie Prakash Prabhu Ulysse Beaugnon P. Phothilimthana Sudip Roy Azalia Mirhoseini E. Brevdo James Laudon Yanqi Zhou 51 5 0 07 Dec 2021
Models of fairness in federated learning Kate Donahue Jon M. Kleinberg FedML 108 11 0 01 Dec 2021
AugLiChem: Data Augmentation Library of Chemical Structures for Machine Learning Rishikesh Magar Yuyang Wang Cooper Lorsung Chen Liang Hariharan Ramasubramanian Peiyuan Li A. Farimani 98 33 0 30 Nov 2021
Turing-Universal Learners with Optimal Scaling Laws Preetum Nakkiran 78 2 0 09 Nov 2021
Learning curves for Gaussian process regression with power-law priors and targets Hui Jin P. Banerjee Guido Montúfar 72 18 0 23 Oct 2021
The Dangers of Underclaiming: Reasons for Caution When Reporting How NLP Systems Fail Sam Bowman OffRL 117 45 0 15 Oct 2021
Training Deep Neural Networks with Joint Quantization and Pruning of Weights and Activations Xinyu Zhang Ian Colbert Ken Kreutz-Delgado Srinjoy Das MQ 100 12 0 15 Oct 2021
Scaling Laws for the Few-Shot Adaptation of Pre-trained Image Classifiers Gabriele Prato Simon Guiroy Ethan Caballero Irina Rish Sarath Chandar VLM 94 12 0 13 Oct 2021
RankingMatch: Delving into Semi-Supervised Learning with Consistency Regularization and Ranking Loss Trung Q. Tran Mingu Kang Daeyoung Kim 35 2 0 09 Oct 2021
Unsupervised Selective Labeling for More Effective Semi-Supervised Learning Xudong Wang Long Lian Stella X. Yu 265 32 0 06 Oct 2021
Max and Coincidence Neurons in Neural Networks Albert Lee Kang L. Wang 16 1 0 04 Oct 2021
Unsolved Problems in ML Safety Dan Hendrycks Nicholas Carlini John Schulman Jacob Steinhardt 289 294 0 28 Sep 2021
Scaling Laws for Neural Machine Translation Behrooz Ghorbani Orhan Firat Markus Freitag Ankur Bapna M. Krikun Xavier Garcia Ciprian Chelba Colin Cherry 90 103 0 16 Sep 2021
Formalizing and Estimating Distribution Inference Risks Anshuman Suri David Evans MIACV 110 52 0 13 Sep 2021
Compute and Energy Consumption Trends in Deep Learning Inference Radosvet Desislavov Fernando Martínez-Plumed José Hernández-Orallo 77 119 0 12 Sep 2021
Why and How Governments Should Monitor AI Development Jess Whittlestone Jack Clark 47 31 0 28 Aug 2021
A Scaling Law for Synthetic-to-Real Transfer: How Much Is Your Pre-training Effective? Hiroaki Mikami Kenji Fukumizu Shogo Murai Shuji Suzuki Yuta Kikuchi Taiji Suzuki S. Maeda Kohei Hayashi 92 12 0 25 Aug 2021
Scalable Bayesian transport maps for high-dimensional non-Gaussian spatial fields Matthias Katzfuss Florian Schäfer OT 118 14 0 09 Aug 2021
On The State of Data In Computer Vision: Human Annotations Remain Indispensable for Developing Deep Learning Models Z. Emam Andrew Kondrich Sasha Harrison Felix Lau Yushi Wang Aerin Kim E. Branson VLM 50 13 0 31 Jul 2021
Dataset Distillation with Infinitely Wide Convolutional Networks Timothy Nguyen Roman Novak Lechao Xiao Jaehoon Lee DD 109 237 0 27 Jul 2021
Learning to Limit Data Collection via Scaling Laws: A Computational Interpretation for the Legal Principle of Data Minimization Divya Shanmugam Samira Shabanian Fernando Diaz Michèle Finck Joanna Biega 75 22 0 16 Jul 2021
A Dual-Purpose Deep Learning Model for Auscultated Lung and Tracheal Sound Analysis Based on Mixed Set Training Fu-Shun Hsu Shang-Ran Huang Chang-Fu Su Chien-Wen Huang Yuan-Ren Cheng ... Nian-Jhen Lin Wan-Ling Tsai Ching-Shiang Lu Chuan Chen F. Lai 57 5 0 09 Jul 2021
Structured Model Pruning of Convolutional Networks on Tensor Processing Units Kongtao Chen Ken Franko Ruoxin Sang CVBM 38 59 0 09 Jul 2021
Meta-learning Amidst Heterogeneity and Ambiguity Kyeongryeol Go Seyoung Yun 91 1 0 05 Jul 2021
Hi-BEHRT: Hierarchical Transformer-based model for accurate prediction of clinical events using multimodal longitudinal electronic health records Yikuan Li M. Mamouei G. Salimi-Khorshidi Shishir Rao A. Hassaine D. Canoy Thomas Lukasiewicz K. Rahimi 96 82 0 21 Jun 2021
Locality defeats the curse of dimensionality in convolutional teacher-student scenarios Alessandro Favero Francesco Cagnetta Matthieu Wyart 102 31 0 16 Jun 2021
Accelerating Sparse Deep Neural Networks Asit K. Mishra J. Latorre Jeff Pool Darko Stosic Dusan Stosic Ganesh Venkatesh Chong Yu Paulius Micikevicius 167 237 0 16 Apr 2021
Scaling Scaling Laws with Board Games Andrew Jones 62 43 0 07 Apr 2021
Vulnerability Due to Training Order in Split Learning Harshit Madaan M. Gawali V. Kulkarni Aniruddha Pant FedML 74 6 0 26 Mar 2021
UNICORN on RAINBOW: A Universal Commonsense Reasoning Model on a New Multitask Benchmark Nicholas Lourie Ronan Le Bras Chandra Bhagavatula Yejin Choi LRM 108 140 0 24 Mar 2021
The Shape of Learning Curves: a Review T. Viering Marco Loog 80 135 0 19 Mar 2021
The Low-Rank Simplicity Bias in Deep Networks Minyoung Huh H. Mobahi Richard Y. Zhang Brian Cheung Pulkit Agrawal Phillip Isola 117 116 0 18 Mar 2021
Is it enough to optimize CNN architectures on ImageNet? Lukas Tuggener Jürgen Schmidhuber Thilo Stadelmann 86 23 0 16 Mar 2021
Revisiting ResNets: Improved Training and Scaling Strategies Irwan Bello W. Fedus Xianzhi Du E. D. Cubuk A. Srinivas Nayeon Lee Jonathon Shlens Barret Zoph 98 302 0 13 Mar 2021