arXiv:2404.10102
Chinchilla Scaling: A replication attempt
15 April 2024
T. Besiroglu
Ege Erdil
Matthew Barnett
Josh You
Papers citing "Chinchilla Scaling: A replication attempt"
21 / 21 papers shown
Superposition Yields Robust Neural Scaling
Yizhou Liu
Ziming Liu
Jeff Gore
MILM
24
0
0
15 May 2025
Compute-Optimal LLMs Provably Generalize Better With Scale
Marc Finzi
Sanyam Kapoor
Diego Granziol
Anming Gu
Christopher De Sa
J. Zico Kolter
Andrew Gordon Wilson
32
0
0
21 Apr 2025
(Mis)Fitting: A Survey of Scaling Laws
Margaret Li
Sneha Kudugunta
Luke Zettlemoyer
69
2
0
26 Feb 2025
Forecasting Frontier Language Model Agent Capabilities
Govind Pimpale
Axel Højmark
Jérémy Scheurer
Marius Hobbhahn
LLMAG
ELM
49
1
0
21 Feb 2025
Physics of Skill Learning
Ziming Liu
Yizhou Liu
Eric J. Michaud
Jeff Gore
Max Tegmark
46
1
0
21 Jan 2025
Loss-to-Loss Prediction: Scaling Laws for All Datasets
David Brandfonbrener
Nikhil Anand
Nikhil Vyas
Eran Malach
Sham Kakade
77
3
0
19 Nov 2024
Scaling Laws for Precision
Tanishq Kumar
Zachary Ankner
Benjamin Spector
Blake Bordelon
Niklas Muennighoff
Mansheej Paul
Cengiz Pehlevan
Christopher Ré
Aditi Raghunathan
AIFin
MoMe
46
13
0
07 Nov 2024
Sparsing Law: Towards Large Language Models with Greater Activation Sparsity
Yuqi Luo
Chenyang Song
Xu Han
Yuxiao Chen
Chaojun Xiao
Zhiyuan Liu
Maosong Sun
49
3
0
04 Nov 2024
How Does Critical Batch Size Scale in Pre-training?
Hanlin Zhang
Depen Morwani
Nikhil Vyas
Jingfeng Wu
Difan Zou
Udaya Ghai
Dean Phillips Foster
Sham Kakade
80
8
0
29 Oct 2024
TabDPT: Scaling Tabular Foundation Models
Junwei Ma
Valentin Thomas
Rasa Hosseinzadeh
Hamidreza Kamkari
Alex Labach
Jesse C. Cresswell
Keyvan Golestan
Guangwei Yu
M. Volkovs
Anthony L. Caterini
LMTD
36
3
0
23 Oct 2024
Bayesian scaling laws for in-context learning
Aryaman Arora
Dan Jurafsky
Christopher Potts
Noah D. Goodman
29
2
0
21 Oct 2024
On the Limitations of Compute Thresholds as a Governance Strategy
Sara Hooker
55
14
0
08 Jul 2024
Resolving Discrepancies in Compute-Optimal Scaling of Language Models
Tomer Porian
Mitchell Wortsman
J. Jitsev
Ludwig Schmidt
Y. Carmon
60
20
0
27 Jun 2024
Time Matters: Scaling Laws for Any Budget
Itay Inbar
Luke Sernau
16
1
0
27 Jun 2024
Fast and Slow Generating: An Empirical Study on Large and Small Language Models Collaborative Decoding
Kaiyan Zhang
Jianyu Wang
Ning Ding
Biqing Qi
Ermo Hua
Xingtai Lv
Bowen Zhou
43
9
0
18 Jun 2024
Scaling Laws in Linear Regression: Compute, Parameters, and Data
Licong Lin
Jingfeng Wu
Sham Kakade
Peter L. Bartlett
Jason D. Lee
LRM
44
15
0
12 Jun 2024
Reconciling Kaplan and Chinchilla Scaling Laws
Tim Pearce
Jinyeop Song
34
8
0
12 Jun 2024
Data Mixing Laws: Optimizing Data Mixtures by Predicting Language Modeling Performance
Jiasheng Ye
Peiju Liu
Tianxiang Sun
Yunhua Zhou
Jun Zhan
Xipeng Qiu
55
64
0
25 Mar 2024
A Dynamical Model of Neural Scaling Laws
Blake Bordelon
Alexander B. Atanasov
Cengiz Pehlevan
51
36
0
02 Feb 2024
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
DeepSeek-AI
Xiao Bi
Deli Chen
Guanting Chen
...
Yao Zhao
Shangyan Zhou
Shunfeng Zhou
Qihao Zhu
Yuheng Zou
LRM
ALM
139
306
0
05 Jan 2024
Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws
Nikhil Sardana
Jacob P. Portes
Sasha Doubov
Jonathan Frankle
LRM
240
69
0
31 Dec 2023