2210.14891
Broken Neural Scaling Laws
26 October 2022
Ethan Caballero
Kshitij Gupta
Irina Rish
David M. Krueger
Papers citing "Broken Neural Scaling Laws"
50 / 61 papers shown
Title
Position: Enough of Scaling LLMs! Lets Focus on Downscaling
Ayan Sengupta
Yash Goel
Tanmoy Chakraborty
34
0
0
02 May 2025
On Model and Data Scaling for Skeleton-based Self-Supervised Gait Recognition
Adrian Cosma
Andy Catruna
Emilian Radoi
31
0
0
10 Apr 2025
Compression Laws for Large Language Models
Ayan Sengupta
Siddhant Chaudhary
Tanmoy Chakraborty
26
0
0
06 Apr 2025
A Multi-Power Law for Loss Curve Prediction Across Learning Rate Schedules
Kairong Luo
Haodong Wen
Shengding Hu
Zhenbo Sun
Zhiyuan Liu
Maosong Sun
Kaifeng Lyu
Wenguang Chen
CLL
64
1
0
17 Mar 2025
Triple Phase Transitions: Understanding the Learning Dynamics of Large Language Models from a Neuroscience Perspective
Yuko Nakagi
Keigo Tada
Sota Yoshino
Shinji Nishimoto
Yu Takagi
LRM
37
0
0
28 Feb 2025
(Mis)Fitting: A Survey of Scaling Laws
Margaret Li
Sneha Kudugunta
Luke Zettlemoyer
69
2
0
26 Feb 2025
Bayesian scaling laws for in-context learning
Aryaman Arora
Dan Jurafsky
Christopher Potts
Noah D. Goodman
24
2
0
21 Oct 2024
A Hitchhiker's Guide to Scaling Law Estimation
Leshem Choshen
Yang Zhang
Jacob Andreas
41
6
0
15 Oct 2024
Analyzing Neural Scaling Laws in Two-Layer Networks with Power-Law Data Spectra
Roman Worschech
B. Rosenow
41
0
0
11 Oct 2024
Dynamic neurons: A statistical physics approach for analyzing deep neural networks
Donghee Lee
Hye-Sung Lee
Jaeok Yi
21
1
0
01 Oct 2024
Scaling Optimal LR Across Token Horizons
Johan Bjorck
Alon Benhaim
Vishrav Chaudhary
Furu Wei
Xia Song
54
4
0
30 Sep 2024
Symbolic Regression with a Learned Concept Library
Arya Grayeli
Atharva Sehgal
Omar Costilla-Reyes
Miles Cranmer
Swarat Chaudhuri
56
10
0
14 Sep 2024
Scaling Law with Learning Rate Annealing
Howe Tissue
Venus Wang
Lu Wang
26
7
0
20 Aug 2024
Rethinking Learned Image Compression: Context is All You Need
Jixiang Luo
26
0
0
16 Jul 2024
MD tree: a model-diagnostic tree grown on loss landscape
Yefan Zhou
Jianlong Chen
Qinxue Cao
Konstantin Schürholt
Yaoqing Yang
33
2
0
24 Jun 2024
Benchmarking Out-of-Distribution Generalization Capabilities of DNN-based Encoding Models for the Ventral Visual Cortex
Spandan Madan
Will Xiao
Mingran Cao
Hanspeter Pfister
Margaret Livingstone
Gabriel Kreiman
OOD
66
4
0
16 Jun 2024
Fine-Tuned 'Small' LLMs (Still) Significantly Outperform Zero-Shot Generative AI Models in Text Classification
Martin Juan José Bucher
Marco Martini
ALM
AI4MH
29
25
0
12 Jun 2024
Scaling Laws in Linear Regression: Compute, Parameters, and Data
Licong Lin
Jingfeng Wu
Sham Kakade
Peter L. Bartlett
Jason D. Lee
LRM
41
15
0
12 Jun 2024
Phase Transitions in the Output Distribution of Large Language Models
Julian Arnold
Flemming Holtorf
Frank Schäfer
Niels Lörch
41
1
0
27 May 2024
gzip Predicts Data-dependent Scaling Laws
Rohan Pandey
27
10
0
26 May 2024
From Frege to chatGPT: Compositionality in language, cognition, and deep neural networks
Jacob Russin
Sam Whitman McGrath
Danielle J. Williams
Lotem Elber-Dorozko
AI4CE
73
3
0
24 May 2024
In-Context Freeze-Thaw Bayesian Optimization for Hyperparameter Optimization
Herilalaina Rakotoarison
Steven Adriaensen
Neeratyoy Mallik
Samir Garibov
Eddie Bergman
Frank Hutter
AI4CE
34
9
0
25 Apr 2024
Mechanistic Interpretability for AI Safety -- A Review
Leonard Bereska
E. Gavves
AI4CE
40
112
0
22 Apr 2024
On the Scalability of GNNs for Molecular Graphs
Maciej Sypetkowski
Frederik Wenkel
Farimah Poursafaei
Nia Dickson
Karush Suri
Philip Fradkin
Dominique Beaini
GNN
AI4CE
39
12
0
17 Apr 2024
YaART: Yet Another ART Rendering Technology
Sergey Kastryulin
Artem Konev
Alexander Shishenya
Eugene Lyapustin
Artem Khurshudov
...
Dmitrii Kornilov
Mikhail Romanov
Artem Babenko
Sergei Ovcharenko
Valentin Khrulkov
EGVM
35
1
0
08 Apr 2024
Data Mixing Laws: Optimizing Data Mixtures by Predicting Language Modeling Performance
Jiasheng Ye
Peiju Liu
Tianxiang Sun
Yunhua Zhou
Jun Zhan
Xipeng Qiu
49
62
0
25 Mar 2024
Language models scale reliably with over-training and on downstream tasks
S. Gadre
Georgios Smyrnis
Vaishaal Shankar
Suchin Gururangan
Mitchell Wortsman
...
Y. Carmon
Achal Dave
Reinhard Heckel
Niklas Muennighoff
Ludwig Schmidt
ALM
ELM
LRM
108
40
0
13 Mar 2024
Scaling Up Adaptive Filter Optimizers
Jonah Casebeer
Nicholas J. Bryan
Paris Smaragdis
31
1
0
01 Mar 2024
Unified View of Grokking, Double Descent and Emergent Abilities: A Perspective from Circuits Competition
Yufei Huang
Shengding Hu
Xu Han
Zhiyuan Liu
Maosong Sun
64
14
0
23 Feb 2024
On Catastrophic Inheritance of Large Foundation Models
Hao Chen
Bhiksha Raj
Xing Xie
Jindong Wang
AI4CE
56
12
0
02 Feb 2024
A Dynamical Model of Neural Scaling Laws
Blake Bordelon
Alexander B. Atanasov
C. Pehlevan
51
36
0
02 Feb 2024
The Effect of Intrinsic Dataset Properties on Generalization: Unraveling Learning Differences Between Natural and Medical Images
N. Konz
Maciej Mazurowski
17
5
0
16 Jan 2024
How predictable is language model benchmark performance?
David Owen
ELM
LRM
27
19
0
09 Jan 2024
Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws
Nikhil Sardana
Jacob P. Portes
Sasha Doubov
Jonathan Frankle
LRM
232
69
0
31 Dec 2023
Vision-by-Language for Training-Free Compositional Image Retrieval
Shyamgopal Karthik
Karsten Roth
Massimiliano Mancini
Zeynep Akata
CoGe
28
52
0
13 Oct 2023
Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting
Kashif Rasul
Arjun Ashok
Andrew Robert Williams
Hena Ghonia
Rishika Bhagwatkar
...
Nicolas Chapados
Alexandre Drouin
Valentina Zantedeschi
Yuriy Nevmyvaka
Irina Rish
AI4TS
BDL
26
42
0
12 Oct 2023
LLMCarbon: Modeling the end-to-end Carbon Footprint of Large Language Models
Ahmad Faiz
S. Kaneda
Ruhan Wang
Rita Osi
Parteek Sharma
Fan Chen
Lei Jiang
31
56
0
25 Sep 2023
Amplifying Pathological Detection in EEG Signaling Pathways through Cross-Dataset Transfer Learning
Mohammad Javad Darvishi Bayazi
M. S. Ghaemi
Timothée Lesort
Md Rifat Arefin
Jocelyn Faubert
Irina Rish
24
11
0
19 Sep 2023
Scaling Laws for Sparsely-Connected Foundation Models
Elias Frantar
C. Riquelme
N. Houlsby
Dan Alistarh
Utku Evci
30
35
0
15 Sep 2023
Uncovering Neural Scaling Laws in Molecular Representation Learning
Dingshuo Chen
Yanqiao Zhu
Jieyu Zhang
Yuanqi Du
Zhixun Li
Qiang Liu
Shu Wu
Liang Wang
32
16
0
15 Sep 2023
Sudden Drops in the Loss: Syntax Acquisition, Phase Transitions, and Simplicity Bias in MLMs
Angelica Chen
Ravid Schwartz-Ziv
Kyunghyun Cho
Matthew L. Leavitt
Naomi Saphra
29
62
0
13 Sep 2023
Scaling Relationship on Learning Mathematical Reasoning with Large Language Models
Zheng Yuan
Hongyi Yuan
Cheng Li
Guanting Dong
Keming Lu
Chuanqi Tan
Chang Zhou
Jingren Zhou
LRM
ALM
25
160
0
03 Aug 2023
Scaling Laws for Imitation Learning in Single-Agent Games
Jens Tuyls
Dhruv Madeka
Kari Torkkola
Dean Phillips Foster
Karthik Narasimhan
Sham Kakade
26
4
0
18 Jul 2023
Scaling Laws Do Not Scale
Fernando Diaz
Michael A. Madaio
23
8
0
05 Jul 2023
Scaling MLPs: A Tale of Inductive Bias
Gregor Bachmann
Sotiris Anagnostidis
Thomas Hofmann
32
38
0
23 Jun 2023
Eliciting the Translation Ability of Large Language Models via Multilingual Finetuning with Translation Instructions
Jiahuan Li
Hao Zhou
Shujian Huang
Shan Chen
Jiajun Chen
LRM
33
54
0
24 May 2023
How Predictable Are Large Language Model Capabilities? A Case Study on BIG-bench
Qinyuan Ye
Harvey Yiyun Fu
Xiang Ren
Robin Jia
ELM
24
21
0
24 May 2023
Are Emergent Abilities of Large Language Models a Mirage?
Rylan Schaeffer
Brando Miranda
Oluwasanmi Koyejo
LRM
39
396
0
28 Apr 2023
DataComp: In search of the next generation of multimodal datasets
S. Gadre
Gabriel Ilharco
Alex Fang
J. Hayase
Georgios Smyrnis
...
A. Dimakis
J. Jitsev
Y. Carmon
Vaishaal Shankar
Ludwig Schmidt
VLM
30
408
0
27 Apr 2023
Emergent and Predictable Memorization in Large Language Models
Stella Biderman
USVSN Sai Prashanth
Lintang Sutawika
Hailey Schoelkopf
Quentin G. Anthony
Shivanshu Purohit
Edward Raff
24
116
0
21 Apr 2023