1909.12673
Cited By
A Constructive Prediction of the Generalization Error Across Scales
27 September 2019
Jonathan S. Rosenfeld
Amir Rosenfeld
Yonatan Belinkov
Nir Shavit
Papers citing
"A Constructive Prediction of the Generalization Error Across Scales"
50 / 159 papers shown
Title
Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws
Nikhil Sardana
Jacob P. Portes
Sasha Doubov
Jonathan Frankle
LRM
240
69
0
31 Dec 2023
Tell, don't show: Declarative facts influence how LLMs generalize
Alexander Meinke
Owain Evans
23
7
0
12 Dec 2023
The Efficiency Spectrum of Large Language Models: An Algorithmic Survey
Tianyu Ding
Tianyi Chen
Haidong Zhu
Jiachen Jiang
Yiqi Zhong
Jinxin Zhou
Guangzhi Wang
Zhihui Zhu
Ilya Zharkov
Luming Liang
29
22
0
01 Dec 2023
Green Edge AI: A Contemporary Survey
Yuyi Mao
X. Yu
Kaibin Huang
Ying-Jun Angela Zhang
Jun Zhang
41
17
0
01 Dec 2023
A Probabilistic Method to Predict Classifier Accuracy on Larger Datasets given Small Pilot Data
Ethan Harvey
Wansu Chen
David M Kent
Michael C. Hughes
21
1
0
29 Nov 2023
Show Your Work with Confidence: Confidence Bands for Tuning Curves
Nicholas Lourie
Kyunghyun Cho
He He
20
2
0
16 Nov 2023
Navigating Scaling Laws: Compute Optimality in Adaptive Model Training
Sotiris Anagnostidis
Gregor Bachmann
Imanol Schlag
Thomas Hofmann
33
2
0
06 Nov 2023
Transductive Learning for Textual Few-Shot Classification in API-based Embedding Models
Pierre Colombo
Victor Pellegrain
Malik Boudiaf
Victor Storchan
Myriam Tami
Ismail Ben Ayed
Céline Hudelot
Pablo Piantanida
38
8
0
21 Oct 2023
Reusing Pretrained Models by Multi-linear Operators for Efficient Training
Yu Pan
Ye Yuan
Yichun Yin
Zenglin Xu
Lifeng Shang
Xin Jiang
Qun Liu
44
16
0
16 Oct 2023
Strategies and impact of learning curve estimation for CNN-based image classification
Laura Didyk
Brayden Yarish
Michael A. Beck
C. Bidinosti
Christopher J. Henry
22
0
0
12 Oct 2023
A Neural Scaling Law from Lottery Ticket Ensembling
Ziming Liu
Max Tegmark
23
4
0
03 Oct 2023
D3: Data Diversity Design for Systematic Generalization in Visual Question Answering
Amir Rahimi
Vanessa D’Amario
Moyuru Yamada
Kentaro Takemoto
Tomotake Sasaki
Xavier Boix
36
1
0
15 Sep 2023
Pretraining on the Test Set Is All You Need
Rylan Schaeffer
15
28
0
13 Sep 2023
No Data Augmentation? Alternative Regularizations for Effective Training on Small Datasets
Lorenzo Brigato
S. Mougiakakou
27
3
0
04 Sep 2023
LibriSQA: A Novel Dataset and Framework for Spoken Question Answering with Large Language Models
Zihan Zhao
Yiyang Jiang
Heyang Liu
Yanfeng Wang
Yu Wang
25
2
0
20 Aug 2023
CodeCoT: Tackling Code Syntax Errors in CoT Reasoning for Code Generation
Dong Huang
Qi Bu
Yuhao Qing
Heming Cui
LRM
32
16
0
17 Aug 2023
The semantic landscape paradigm for neural networks
Shreyas Gokhale
21
2
0
18 Jul 2023
Scaling Laws for Imitation Learning in Single-Agent Games
Jens Tuyls
Dhruv Madeka
Kari Torkkola
Dean Phillips Foster
Karthik Narasimhan
Sham Kakade
32
4
0
18 Jul 2023
Scaling Laws Do Not Scale
Fernando Diaz
Michael A. Madaio
23
8
0
05 Jul 2023
Beyond Scale: the Diversity Coefficient as a Data Quality Metric Demonstrates LLMs are Pre-trained on Formally Diverse Data
Alycia Lee
Brando Miranda
Sudharsan Sundar
Sanmi Koyejo
40
17
0
24 Jun 2023
Scaling MLPs: A Tale of Inductive Bias
Gregor Bachmann
Sotiris Anagnostidis
Thomas Hofmann
34
38
0
23 Jun 2023
Delegated Classification
Eden Saig
Inbal Talgam-Cohen
Nir Rosenfeld
16
7
0
20 Jun 2023
SUR-adapter: Enhancing Text-to-Image Pre-trained Diffusion Models with Large Language Models
Shan Zhong
Zhongzhan Huang
Wushao Wen
Jinghui Qin
Liang Lin
24
40
0
09 May 2023
Are Emergent Abilities of Large Language Models a Mirage?
Rylan Schaeffer
Brando Miranda
Oluwasanmi Koyejo
LRM
44
396
0
28 Apr 2023
nanoLM: an Affordable LLM Pre-training Benchmark via Accurate Loss Prediction across Scales
Yiqun Yao
Siqi Fan
Xiusheng Huang
Xuezhi Fang
Xiang Li
...
Peng Han
Shuo Shang
Kang Liu
Aixin Sun
Yequan Wang
33
6
0
14 Apr 2023
kNN Prompting: Beyond-Context Learning with Calibration-Free Nearest Neighbor Inference
Benfeng Xu
Quan Wang
Zhendong Mao
Yajuan Lyu
Qiaoqiao She
Yongdong Zhang
104
52
0
24 Mar 2023
The Quantization Model of Neural Scaling
Eric J. Michaud
Ziming Liu
Uzay Girit
Max Tegmark
MILM
27
77
0
23 Mar 2023
SemDeDup: Data-efficient learning at web-scale through semantic deduplication
Amro Abbas
Kushal Tirumala
Daniel Simig
Surya Ganguli
Ari S. Morcos
25
162
0
16 Mar 2023
Kernel Regression with Infinite-Width Neural Networks on Millions of Examples
Ben Adlam
Jaehoon Lee
Shreyas Padhy
Zachary Nado
Jasper Snoek
20
11
0
09 Mar 2023
Robust mmWave Beamforming by Self-Supervised Hybrid Deep Learning
Fenghao Zhu
Bohao Wang
Zhaohui Yang
Chongwen Huang
Zhaoyang Zhang
G. C. Alexandropoulos
Chau Yuen
Merouane Debbah
19
12
0
09 Mar 2023
A Meta-Learning Approach to Predicting Performance and Data Requirements
Achin Jain
Gurumurthy Swaminathan
Paolo Favaro
Hao Yang
Avinash Ravichandran
...
Alessandro Achille
O. Dabeer
Bernt Schiele
A. Swaminathan
Stefano Soatto
37
8
0
02 Mar 2023
Learning to Grow Pretrained Models for Efficient Transformer Training
Peihao Wang
Yikang Shen
Lucas Torroba Hennigen
P. Greengard
Leonid Karlinsky
Rogerio Feris
David D. Cox
Zhangyang Wang
Yoon Kim
42
53
0
02 Mar 2023
LLaMA: Open and Efficient Foundation Language Models
Hugo Touvron
Thibaut Lavril
Gautier Izacard
Xavier Martinet
Marie-Anne Lachaux
...
Faisal Azhar
Aurelien Rodriguez
Armand Joulin
Edouard Grave
Guillaume Lample
ALM
PILM
28
12,291
0
27 Feb 2023
Scaling Laws for Multilingual Neural Machine Translation
Patrick Fernandes
Behrooz Ghorbani
Xavier Garcia
Markus Freitag
Orhan Firat
38
29
0
19 Feb 2023
Cliff-Learning
T. T. Wang
I. Zablotchi
Nir Shavit
Jonathan S. Rosenfeld
36
0
0
14 Feb 2023
Data pruning and neural scaling laws: fundamental limitations of score-based algorithms
Fadhel Ayed
Soufiane Hayou
14
9
0
14 Feb 2023
Evaluating Self-Supervised Learning via Risk Decomposition
Yann Dubois
Tatsunori Hashimoto
Percy Liang
14
9
0
06 Feb 2023
Scaling Laws for Hyperparameter Optimization
Arlind Kadra
Maciej Janowski
Martin Wistuba
Josif Grabocka
25
9
0
01 Feb 2023
The case for 4-bit precision: k-bit Inference Scaling Laws
Tim Dettmers
Luke Zettlemoyer
MQ
24
214
0
19 Dec 2022
Power-law Scaling to Assist with Key Challenges in Artificial Intelligence
Yuval Meir
Shira Sardi
Shiri Hodassman
Karin Kisos
Itamar Ben-Noam
A. Goldental
Ido Kanter
22
16
0
15 Nov 2022
A Solvable Model of Neural Scaling Laws
A. Maloney
Daniel A. Roberts
J. Sully
36
51
0
30 Oct 2022
Broken Neural Scaling Laws
Ethan Caballero
Kshitij Gupta
Irina Rish
David M. Krueger
30
74
0
26 Oct 2022
Measures of Information Reflect Memorization Patterns
Rachit Bansal
Danish Pruthi
Yonatan Belinkov
33
8
0
17 Oct 2022
Active Learning from the Web
Ryoma Sato
24
0
0
15 Oct 2022
Optimizing Data Collection for Machine Learning
Rafid Mahmood
James Lucas
J. Álvarez
Sanja Fidler
M. Law
93
26
0
03 Oct 2022
Data Budgeting for Machine Learning
Xin-Bo Zhao
Weixin Liang
James Zou
23
2
0
03 Oct 2022
Scaling Laws for a Multi-Agent Reinforcement Learning Model
Oren Neumann
C. Gros
29
26
0
29 Sep 2022
Scaling Laws For Deep Learning Based Image Reconstruction
Tobit Klug
Reinhard Heckel
59
12
0
27 Sep 2022
Efficient Quantized Sparse Matrix Operations on Tensor Cores
Shigang Li
Kazuki Osawa
Torsten Hoefler
82
31
0
14 Sep 2022
Revisiting Neural Scaling Laws in Language and Vision
Ibrahim M. Alabdulmohsin
Behnam Neyshabur
Xiaohua Zhai
159
102
0
13 Sep 2022