ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2404.10102
  4. Cited By
Chinchilla Scaling: A replication attempt

Chinchilla Scaling: A replication attempt

15 April 2024
T. Besiroglu
Ege Erdil
Matthew Barnett
Josh You
ArXivPDFHTML

Papers citing "Chinchilla Scaling: A replication attempt"

21 / 21 papers shown
Title
Superposition Yields Robust Neural Scaling
Superposition Yields Robust Neural Scaling
Yizhou Liu
Ziming Liu
Jeff Gore
MILM
24
0
0
15 May 2025
Compute-Optimal LLMs Provably Generalize Better With Scale
Compute-Optimal LLMs Provably Generalize Better With Scale
Marc Finzi
Sanyam Kapoor
Diego Granziol
Anming Gu
Christopher De Sa
J. Zico Kolter
Andrew Gordon Wilson
32
0
0
21 Apr 2025
(Mis)Fitting: A Survey of Scaling Laws
(Mis)Fitting: A Survey of Scaling Laws
Margaret Li
Sneha Kudugunta
Luke Zettlemoyer
69
2
0
26 Feb 2025
Forecasting Frontier Language Model Agent Capabilities
Forecasting Frontier Language Model Agent Capabilities
Govind Pimpale
Axel Højmark
Jérémy Scheurer
Marius Hobbhahn
LLMAG
ELM
49
1
0
21 Feb 2025
Physics of Skill Learning
Physics of Skill Learning
Ziming Liu
Yizhou Liu
Eric J. Michaud
Jeff Gore
Max Tegmark
46
1
0
21 Jan 2025
Loss-to-Loss Prediction: Scaling Laws for All Datasets
Loss-to-Loss Prediction: Scaling Laws for All Datasets
David Brandfonbrener
Nikhil Anand
Nikhil Vyas
Eran Malach
Sham Kakade
77
3
0
19 Nov 2024
Scaling Laws for Precision
Scaling Laws for Precision
Tanishq Kumar
Zachary Ankner
Benjamin Spector
Blake Bordelon
Niklas Muennighoff
Mansheej Paul
Cengiz Pehlevan
Christopher Ré
Aditi Raghunathan
AIFin
MoMe
46
13
0
07 Nov 2024
Sparsing Law: Towards Large Language Models with Greater Activation Sparsity
Sparsing Law: Towards Large Language Models with Greater Activation Sparsity
Yuqi Luo
Chenyang Song
Xu Han
Yuxiao Chen
Chaojun Xiao
Zhiyuan Liu
Maosong Sun
49
3
0
04 Nov 2024
How Does Critical Batch Size Scale in Pre-training?
How Does Critical Batch Size Scale in Pre-training?
Hanlin Zhang
Depen Morwani
Nikhil Vyas
Jingfeng Wu
Difan Zou
Udaya Ghai
Dean Phillips Foster
Sham Kakade
80
8
0
29 Oct 2024
TabDPT: Scaling Tabular Foundation Models
TabDPT: Scaling Tabular Foundation Models
Junwei Ma
Valentin Thomas
Rasa Hosseinzadeh
Hamidreza Kamkari
Alex Labach
Jesse C. Cresswell
Keyvan Golestan
Guangwei Yu
M. Volkovs
Anthony L. Caterini
LMTD
36
3
0
23 Oct 2024
Bayesian scaling laws for in-context learning
Bayesian scaling laws for in-context learning
Aryaman Arora
Dan Jurafsky
Christopher Potts
Noah D. Goodman
29
2
0
21 Oct 2024
On the Limitations of Compute Thresholds as a Governance Strategy
On the Limitations of Compute Thresholds as a Governance Strategy
Sara Hooker
55
14
0
08 Jul 2024
Resolving Discrepancies in Compute-Optimal Scaling of Language Models
Resolving Discrepancies in Compute-Optimal Scaling of Language Models
Tomer Porian
Mitchell Wortsman
J. Jitsev
Ludwig Schmidt
Y. Carmon
60
20
0
27 Jun 2024
Time Matters: Scaling Laws for Any Budget
Time Matters: Scaling Laws for Any Budget
Itay Inbar
Luke Sernau
16
1
0
27 Jun 2024
Fast and Slow Generating: An Empirical Study on Large and Small Language
  Models Collaborative Decoding
Fast and Slow Generating: An Empirical Study on Large and Small Language Models Collaborative Decoding
Kaiyan Zhang
Jianyu Wang
Ning Ding
Biqing Qi
Ermo Hua
Xingtai Lv
Bowen Zhou
43
9
0
18 Jun 2024
Scaling Laws in Linear Regression: Compute, Parameters, and Data
Scaling Laws in Linear Regression: Compute, Parameters, and Data
Licong Lin
Jingfeng Wu
Sham Kakade
Peter L. Bartlett
Jason D. Lee
LRM
44
15
0
12 Jun 2024
Reconciling Kaplan and Chinchilla Scaling Laws
Reconciling Kaplan and Chinchilla Scaling Laws
Tim Pearce
Jinyeop Song
34
8
0
12 Jun 2024
Data Mixing Laws: Optimizing Data Mixtures by Predicting Language Modeling Performance
Data Mixing Laws: Optimizing Data Mixtures by Predicting Language Modeling Performance
Jiasheng Ye
Peiju Liu
Tianxiang Sun
Yunhua Zhou
Jun Zhan
Xipeng Qiu
55
64
0
25 Mar 2024
A Dynamical Model of Neural Scaling Laws
A Dynamical Model of Neural Scaling Laws
Blake Bordelon
Alexander B. Atanasov
Cengiz Pehlevan
51
36
0
02 Feb 2024
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
DeepSeek-AI Xiao Bi
:
Xiao Bi
Deli Chen
Guanting Chen
...
Yao Zhao
Shangyan Zhou
Shunfeng Zhou
Qihao Zhu
Yuheng Zou
LRM
ALM
139
306
0
05 Jan 2024
Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws
Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws
Nikhil Sardana
Jacob P. Portes
Sasha Doubov
Jonathan Frankle
LRM
240
69
0
31 Dec 2023
1