2206.14486
Cited By
Beyond neural scaling laws: beating power law scaling via data pruning
29 June 2022
Ben Sorscher
Robert Geirhos
Shashank Shekhar
Surya Ganguli
Ari S. Morcos
Papers citing
"Beyond neural scaling laws: beating power law scaling via data pruning"
27 / 77 papers shown
Title
No Train No Gain: Revisiting Efficient Training Algorithms For Transformer-based Language Models
Jean Kaddour
Oscar Key
Piotr Nawrot
Pasquale Minervini
Matt J. Kusner
22
41
0
12 Jul 2023
Performance Scaling via Optimal Transport: Enabling Data Selection from Partially Revealed Sources
Feiyang Kang
H. Just
Anit Kumar Sahu
R. Jia
59
10
0
05 Jul 2023
GIO: Gradient Information Optimization for Training Dataset Selection
Dante Everaert
Christopher Potts
23
3
0
20 Jun 2023
AdaSelection: Accelerating Deep Learning Training through Data Subsampling
Minghe Zhang
Chaosheng Dong
Jinmiao Fu
Tianchen Zhou
Jia Liang
...
Bo Liu
Michinari Momma
Bryan Wang
Yan Gao
Yi Sun
35
3
0
19 Jun 2023
NLU on Data Diets: Dynamic Data Subset Selection for NLP Classification Tasks
Jean-Michel Attendu
Jean-Philippe Corbeil
35
15
0
05 Jun 2023
Repeated Random Sampling for Minimizing the Time-to-Accuracy of Learning
Patrik Okanovic
R. Waleffe
Vasilis Mageirakos
Konstantinos E. Nikolakakis
Amin Karbasi
Dionysis Kalogerias
Nezihe Merve Gürel
Theodoros Rekatsinas
DD
45
12
0
28 May 2023
Selective Pre-training for Private Fine-tuning
Da Yu
Sivakanth Gopi
Janardhan Kulkarni
Zinan Lin
Saurabh Naik
Tomasz Religa
Jian Yin
Huishuai Zhang
38
19
0
23 May 2023
DoReMi: Optimizing Data Mixtures Speeds Up Language Model Pretraining
Sang Michael Xie
Hieu H. Pham
Xuanyi Dong
Nan Du
Hanxiao Liu
Yifeng Lu
Percy Liang
Quoc V. Le
Tengyu Ma
Adams Wei Yu
MoMe
MoE
56
178
0
17 May 2023
DATED: Guidelines for Creating Synthetic Datasets for Engineering Design Applications
Cyril Picard
Jürg Schiffmann
Faez Ahmed
40
8
0
15 May 2023
The Vault: A Comprehensive Multilingual Dataset for Advancing Code Understanding and Generation
Dũng Nguyễn Mạnh
Nam Le Hai
An Dau
A. Nguyen
Khanh N. Nghiem
Jingnan Guo
Nghi D. Q. Bui
34
15
0
09 May 2023
Learning Trajectories are Generalization Indicators
Jingwen Fu
Zhizheng Zhang
Dacheng Yin
Yan Lu
Nanning Zheng
AI4CE
33
3
0
25 Apr 2023
The MiniPile Challenge for Data-Efficient Language Models
Jean Kaddour
MoE
ALM
24
40
0
17 Apr 2023
Active Self-Supervised Learning: A Few Low-Cost Relationships Are All You Need
Vivien A. Cabannes
Léon Bottou
Yann LeCun
Randall Balestriero
48
13
0
27 Mar 2023
Cliff-Learning
T. T. Wang
I. Zablotchi
Nir Shavit
Jonathan S. Rosenfeld
38
0
0
14 Feb 2023
Understanding Reconstruction Attacks with the Neural Tangent Kernel and Dataset Distillation
Noel Loo
Ramin Hasani
Mathias Lechner
Alexander Amini
Daniela Rus
DD
42
5
0
02 Feb 2023
A Survey on Efficient Training of Transformers
Bohan Zhuang
Jing Liu
Zizheng Pan
Haoyu He
Yuetian Weng
Chunhua Shen
31
47
0
02 Feb 2023
Data Distillation: A Survey
Noveen Sachdeva
Julian McAuley
DD
45
73
0
11 Jan 2023
A Solvable Model of Neural Scaling Laws
A. Maloney
Daniel A. Roberts
J. Sully
36
51
0
30 Oct 2022
Broken Neural Scaling Laws
Ethan Caballero
Kshitij Gupta
Irina Rish
David M. Krueger
30
74
0
26 Oct 2022
AVES: Animal Vocalization Encoder based on Self-Supervision
Masato Hagiwara
CLIP
VLM
AI4TS
19
24
0
26 Oct 2022
Improving Data Quality with Training Dynamics of Gradient Boosting Decision Trees
M. Ponti
L. Oliveira
Mathias Esteban
Valentina Garcia
J. Román
Luis Argerich
TDI
30
4
0
20 Oct 2022
Unmasking the Lottery Ticket Hypothesis: What's Encoded in a Winning Ticket's Mask?
Mansheej Paul
F. Chen
Brett W. Larsen
Jonathan Frankle
Surya Ganguli
Gintare Karolina Dziugaite
UQCV
32
38
0
06 Oct 2022
Metadata Archaeology: Unearthing Data Subsets by Leveraging Training Dynamics
Shoaib Ahmed Siddiqui
Nitarshan Rajkumar
Tegan Maharaj
David M. Krueger
Sara Hooker
44
27
0
20 Sep 2022
Quality Not Quantity: On the Interaction between Dataset Design and Robustness of CLIP
Thao Nguyen
Gabriel Ilharco
Mitchell Wortsman
Sewoong Oh
Ludwig Schmidt
CLIP
VLM
47
99
0
10 Aug 2022
Can we achieve robustness from data alone?
Nikolaos Tsilivis
Jingtong Su
Julia Kempe
OOD
DD
36
18
0
24 Jul 2022
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
264
4,489
0
23 Jan 2020
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky
Jia Deng
Hao Su
J. Krause
S. Satheesh
...
A. Karpathy
A. Khosla
Michael S. Bernstein
Alexander C. Berg
Li Fei-Fei
VLM
ObjD
296
39,217
0
01 Sep 2014