Deep Learning Scaling is Predictable, Empirically

1 December 2017
Joel Hestness
Sharan Narang
Newsha Ardalani
G. Diamos
Heewoo Jun
Hassan Kianinejad
Md. Mostofa Ali Patwary
Yang Yang
Yanqi Zhou
arXiv:1712.00409

Papers citing "Deep Learning Scaling is Predictable, Empirically"

50 / 372 papers shown
EvoLM: In Search of Lost Language Model Training Dynamics
Zhenting Qi
Fan Nie
Alexandre Alahi
James Zou
Himabindu Lakkaraju
Yilun Du
Eric P. Xing
Sham Kakade
Hanlin Zhang
52
1
0
19 Jun 2025
Complexity Scaling Laws for Neural Models using Combinatorial Optimization
Lowell Weissman
Michael Krumdick
A. Lynn Abbott
43
0
0
15 Jun 2025
Scaling Laws for Uncertainty in Deep Learning
Mattia Rosso
Simone Rossi
Giulio Franzese
Markus Heinonen
Maurizio Filippone
BDL, UQCV
92
0
0
11 Jun 2025
Improved Scaling Laws in Linear Regression via Data Reuse
Licong Lin
Jingfeng Wu
Peter Bartlett
29
0
0
10 Jun 2025
Scaling Laws of Motion Forecasting and Planning -- A Technical Report
Mustafa Baniodeh
Kratarth Goel
Scott Ettinger
Carlos Fuertes
Ari Seff
...
Vinutha Kallem
Sergio Casas
Rami Al-Rfou
Benjamin Sapp
Dragomir Anguelov
33
0
0
09 Jun 2025
Models of Heavy-Tailed Mechanistic Universality
Liam Hodgkinson
Zhichao Wang
Michael W. Mahoney
80
1
0
04 Jun 2025
UniSite: The First Cross-Structure Dataset and Learning Framework for End-to-End Ligand Binding Site Detection
Jigang Fan
Quanlin Wu
Shengjie Luo
Liwei Wang
26
0
0
03 Jun 2025
On the Scaling of Robustness and Effectiveness in Dense Retrieval
Yu-an Liu
Ruqing Zhang
Jiafeng Guo
Maarten de Rijke
Yixing Fan
Xueqi Cheng
32
0
0
30 May 2025
Bayesian Neural Scaling Law Extrapolation with Prior-Data Fitted Networks
Dongwoo Lee
Dong Bok Lee
Steven Adriaensen
Juho Lee
Sung Ju Hwang
Frank Hutter
Seon Joo Kim
Hae Beom Lee
BDL
71
0
0
29 May 2025
Progressive Scaling Visual Object Tracking
Jack Hong
Shilin Yan
Zehao Xiao
Jiayin Cai
Xiaolong Jiang
Yao Hu
Henghui Ding
81
0
0
26 May 2025
Faithful Group Shapley Value
Kiljae Lee
Ziqi Liu
Weijing Tang
Yuan Zhang
TDI, FedML
145
0
0
25 May 2025
A Coreset Selection of Coreset Selection Literature: Introduction and Recent Advances
Brian B. Moser
Arundhati S. Shanbhag
Stanislav Frolov
Federico Raue
Joachim Folz
Andreas Dengel
258
0
0
23 May 2025
Small Models, Smarter Learning: The Power of Joint Task Training
C. Both
Benjamin Hoover
Hendrik Strobelt
Dmitry Krotov
Daniel Karl I. Weidele
Mauro Martino
Nima Dehmamy
16
0
0
23 May 2025
Materials Generation in the Era of Artificial Intelligence: A Comprehensive Survey
Zhixun Li
Bin Cao
Rui Jiao
Liang Wang
Ding Wang
...
Qiang Liu
Yu Rong
Liang Wang
Tong-yi Zhang
Jeffrey Xu Yu
3DV, AI4CE
110
1
0
22 May 2025
LCDB 1.1: A Database Illustrating Learning Curves Are More Ill-Behaved Than Previously Thought
Cheng Yan
Felix Mohr
Tom Viering
131
0
0
21 May 2025
On the creation of narrow AI: hierarchy and nonlocality of neural network skills
Eric J. Michaud
Asher Parker-Sartori
Max Tegmark
133
0
0
21 May 2025
Equally Critical: Samples, Targets, and Their Mappings in Datasets
Runkang Yang
Peng Sun
Xinyi Shang
Yi Tang
Tao R. Lin
37
0
0
17 May 2025
Parallel Scaling Law for Language Models
Mouxiang Chen
Binyuan Hui
Zeyu Cui
Jiaxi Yang
Dayiheng Liu
Jianling Sun
Junyang Lin
Zhongxin Liu
MoE, LRM
91
2
0
15 May 2025
Superposition Yields Robust Neural Scaling
Yizhou Liu
Ziming Liu
Jeff Gore
MILM
133
1
0
15 May 2025
Learning curves theory for hierarchically compositional data with power-law distributed features
Francesco Cagnetta
Hyunmo Kang
Matthieu Wyart
137
1
0
11 May 2025
Extended Fiducial Inference for Individual Treatment Effects via Deep Neural Networks
Sehwan Kim
F. Liang
FedML
93
0
0
04 May 2025
Don't be lazy: CompleteP enables compute-efficient deep transformers
Nolan Dey
Bin Claire Zhang
Lorenzo Noci
Mufan Li
Blake Bordelon
Shane Bergsma
Cengiz Pehlevan
Boris Hanin
Joel Hestness
124
2
0
02 May 2025
Learning to Reason under Off-Policy Guidance
Jianhao Yan
Yafu Li
Zican Hu
Zhi Wang
Ganqu Cui
Xiaoye Qu
Yu Cheng
Yue Zhang
OffRL, LRM
178
17
0
21 Apr 2025
Multispectral airborne laser scanning for tree species classification: a benchmark of machine learning and deep learning algorithms
Josef Taher
Eric Hyyppä
Matti Hyyppä
Klaara Salolahti
Xiaowei Yu
...
Roope Näsi
H. Hyyti
Siiri Pyykkönen
Peilun Hu
Juha Hyyppä
59
0
0
19 Apr 2025
Evaluation Under Imperfect Benchmarks and Ratings: A Case Study in Text Simplification
Joseph Liu
Yoonsoo Nam
Xinyue Cui
Swabha Swayamdipta
130
0
0
13 Apr 2025
Data Scaling Laws for End-to-End Autonomous Driving
Alexander Naumann
Xunjiang Gu
Tolga Dimlioglu
Mariusz Bojarski
Alperen Degirmenci
A. Popov
Devansh Bisla
Marco Pavone
Urs Muller
Boris Ivanovic
98
0
0
06 Apr 2025
Hyperflows: Pruning Reveals the Importance of Weights
Eugen Barbulescu
Antonio Alexoaie
62
0
0
06 Apr 2025
Compression Laws for Large Language Models
Ayan Sengupta
Siddhant Chaudhary
Tanmoy Chakraborty
61
0
0
06 Apr 2025
Geometric Median Matching for Robust k-Subset Selection from Noisy Data
Anish Acharya
Sujay Sanghavi
Alexandros G. Dimakis
Inderjit S Dhillon
AAML
185
0
0
01 Apr 2025
Force-Free Molecular Dynamics Through Autoregressive Equivariant Networks
Fabian L. Thiemann
Thiago Reschützegger
Massimiliano Esposito
Tseden Taddese
Juan D. Olarte-Plata
Fausto Martelli
AI4CE
130
1
0
31 Mar 2025
Scaling Laws of Synthetic Data for Language Models
Zeyu Qin
Qingxiu Dong
Xingxing Zhang
Li Dong
Xiaolong Huang
...
Hany Awadalla
Yi R. Fung
Weizhu Chen
Minhao Cheng
Furu Wei
SyDa
141
7
0
25 Mar 2025
Exploring Training and Inference Scaling Laws in Generative Retrieval
Hongru Cai
Yongqi Li
Ruifeng Yuan
Wenjie Wang
Zhen Zhang
Wenjie Li
Tat-Seng Chua
77
1
0
24 Mar 2025
Improving Quantization with Post-Training Model Expansion
Giuseppe Franco
Pablo Monteagudo-Lago
Ian Colbert
Nicholas J. Fraser
Michaela Blott
MQ
107
2
0
21 Mar 2025
A Multi-Power Law for Loss Curve Prediction Across Learning Rate Schedules
Kairong Luo
Haodong Wen
Shengding Hu
Zhenbo Sun
Zhiyuan Liu
Maosong Sun
Kaifeng Lyu
Wenguang Chen
CLL
117
3
0
17 Mar 2025
Scale Efficient Training for Large Datasets
Qing Zhou
Junyu Gao
Qi Wang
DD
126
0
0
17 Mar 2025
Robustness Tokens: Towards Adversarial Robustness of Transformers
Brian Pulfer
Yury Belousov
S. Voloshynovskiy
AAML
85
0
0
13 Mar 2025
Communication-Efficient Language Model Training Scales Reliably and Robustly: Scaling Laws for DiLoCo
Zachary B. Charles
Gabriel Teston
Lucio Dery
Keith Rush
Nova Fallen
Zachary Garrett
Arthur Szlam
Arthur Douillard
461
6
0
12 Mar 2025
RouterEval: A Comprehensive Benchmark for Routing LLMs to Explore Model-level Scaling Up in LLMs
Zhongzhan Huang
Guoming Ling
Vincent S. Liang
Yupei Lin
Yandong Chen
Shanshan Zhong
Hefeng Wu
LRM
235
7
0
08 Mar 2025
Scaling Law Phenomena Across Regression Paradigms: Multiple and Kernel Approaches
Yifang Chen
Xuyang Guo
Xiaoyu Li
Yingyu Liang
Zhenmei Shi
Zhao Song
105
3
0
03 Mar 2025
Hebbian learning the local structure of language
P. Myles Eugenio
115
0
0
03 Mar 2025
Position: Solve Layerwise Linear Models First to Understand Neural Dynamical Phenomena (Neural Collapse, Emergence, Lazy/Rich Regime, and Grokking)
Yoonsoo Nam
Seok Hyeong Lee
Clementine Domine
Yea Chan Park
Charles London
Wonyl Choi
Niclas Goring
Seungjai Lee
AI4CE
221
1
0
28 Feb 2025
(Mis)Fitting: A Survey of Scaling Laws
Margaret Li
Sneha Kudugunta
Luke Zettlemoyer
140
4
0
26 Feb 2025
Factual Inconsistency in Data-to-Text Generation Scales Exponentially with LLM Size: A Statistical Validation
Joy Mahapatra
Soumyajit Roy
Utpal Garain
HILM, ALM
149
0
0
17 Feb 2025
Privacy-Preserving Dataset Combination
Keren Fuentes
Mimee Xu
Irene Chen
116
0
0
09 Feb 2025
Top Ten Challenges Towards Agentic Neural Graph Databases
Jiaxin Bai
Zehua Wang
Yukun Zhou
Hang Yin
Weizhi Fei
...
Binhang Yuan
Wei Wang
Lei Chen
Xiaofang Zhou
Yangqiu Song
298
4
0
24 Jan 2025
Geometric Median (GM) Matching for Robust Data Pruning
Anish Acharya
Inderjit S Dhillon
Sujay Sanghavi
AAML
141
0
0
20 Jan 2025
Large Language Monkeys: Scaling Inference Compute with Repeated Sampling
Bradley Brown
Jordan Juravsky
Ryan Ehrlich
Ronald Clark
Quoc V. Le
Christopher Ré
Azalia Mirhoseini
ALM, LRM
307
331
0
03 Jan 2025
Towards Precise Scaling Laws for Video Diffusion Transformers
Yuanyang Yin
Yaqi Zhao
Mingwu Zheng
Ke Lin
Jiarong Ou
...
Pengfei Wan
Di Zhang
Baoqun Yin
Wentao Zhang
Kun Gai
199
3
0
03 Jan 2025
VisDoM: Multi-Document QA with Visually Rich Elements Using Multimodal Retrieval-Augmented Generation
Manan Suri
Puneet Mathur
Franck Dernoncourt
Kanika Goswami
Ryan Rossi
Dinesh Manocha
159
5
0
14 Dec 2024
Implicit Delta Learning of High Fidelity Neural Network Potentials
Stephan Thaler
Cristian Gabellini
Nikhil Shenoy
Prudencio Tossou
AI4CE
170
1
0
08 Dec 2024