arXiv:2004.10802
A Neural Scaling Law from the Dimension of the Data Manifold
22 April 2020
Utkarsh Sharma
Jared Kaplan
Papers citing "A Neural Scaling Law from the Dimension of the Data Manifold"
20 / 20 papers shown
Explaining Context Length Scaling and Bounds for Language Models
Jingzhe Shi
Qinwei Ma
Hongyi Liu
Hang Zhao
Jeng-Neng Hwang
Lei Li
LRM
86
3
0
03 Feb 2025
Physics of Skill Learning
Ziming Liu
Yizhou Liu
Eric J. Michaud
Jeff Gore
Max Tegmark
54
2
0
21 Jan 2025
Scaling Laws in Linear Regression: Compute, Parameters, and Data
Licong Lin
Jingfeng Wu
Sham Kakade
Peter L. Bartlett
Jason D. Lee
LRM
49
15
0
12 Jun 2024
Survival of the Fittest Representation: A Case Study with Modular Addition
Xiaoman Delores Ding
Zifan Carl Guo
Eric J. Michaud
Ziming Liu
Max Tegmark
55
3
0
27 May 2024
Scaling Law for Time Series Forecasting
Jingzhe Shi
Qinwei Ma
Huan Ma
Lei Li
AI4TS
33
8
0
24 May 2024
KAN: Kolmogorov-Arnold Networks
Ziming Liu
Yixuan Wang
Sachin Vaidya
Fabian Ruehle
James Halverson
Marin Soljacic
Thomas Y. Hou
Max Tegmark
100
487
0
30 Apr 2024
Compositional Capabilities of Autoregressive Transformers: A Study on Synthetic, Interpretable Tasks
Rahul Ramesh
Ekdeep Singh Lubana
Mikail Khona
Robert P. Dick
Hidenori Tanaka
CoGe
39
8
0
21 Nov 2023
Precision Machine Learning
Eric J. Michaud
Ziming Liu
Max Tegmark
24
34
0
24 Oct 2022
Scaling Laws for Reward Model Overoptimization
Leo Gao
John Schulman
Jacob Hilton
ALM
41
493
0
19 Oct 2022
Limitations of the NTK for Understanding Generalization in Deep Learning
Nikhil Vyas
Yamini Bansal
Preetum Nakkiran
34
32
0
20 Jun 2022
Deconstructing Distributions: A Pointwise Framework of Learning
Gal Kaplun
Nikhil Ghosh
Saurabh Garg
Boaz Barak
Preetum Nakkiran
OOD
38
21
0
20 Feb 2022
Data Scaling Laws in NMT: The Effect of Noise and Architecture
Yamini Bansal
Behrooz Ghorbani
Ankush Garg
Biao Zhang
M. Krikun
Colin Cherry
Behnam Neyshabur
Orhan Firat
42
47
0
04 Feb 2022
Tensor network to learn the wavefunction of data
A. Dymarsky
K. Pavlenko
32
6
0
15 Nov 2021
Practical Galaxy Morphology Tools from Deep Supervised Representation Learning
Mike Walmsley
Anna M. M. Scaife
Chris J. Lintott
Michelle Lochner
Yu Zhu
...
Xibo Ma
Sandor Kruk
Zhen Lei
G. Guo
B. Simmons
24
29
0
25 Oct 2021
A Scaling Law for Synthetic-to-Real Transfer: How Much Is Your Pre-training Effective?
Hiroaki Mikami
Kenji Fukumizu
Shogo Murai
Shuji Suzuki
Yuta Kikuchi
Taiji Suzuki
S. Maeda
Kohei Hayashi
42
12
0
25 Aug 2021
Explaining Neural Scaling Laws
Yasaman Bahri
Ethan Dyer
Jared Kaplan
Jaehoon Lee
Utkarsh Sharma
32
250
0
12 Feb 2021
Learning Curve Theory
Marcus Hutter
148
59
0
08 Feb 2021
Scaling Laws for Autoregressive Generative Modeling
T. Henighan
Jared Kaplan
Mor Katz
Mark Chen
Christopher Hesse
...
Nick Ryder
Daniel M. Ziegler
John Schulman
Dario Amodei
Sam McCandlish
53
408
0
28 Oct 2020
The Depth-to-Width Interplay in Self-Attention
Yoav Levine
Noam Wies
Or Sharir
Hofit Bata
Amnon Shashua
30
45
0
22 Jun 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
266
4,532
0
23 Jan 2020