ResearchTrend.AI

Adaptive Data Optimization: Dynamic Sample Selection with Scaling Laws
arXiv: 2410.11820

15 October 2024
Yiding Jiang
Allan Zhou
Zhili Feng
Sadhika Malladi
J. Zico Kolter

Papers citing "Adaptive Data Optimization: Dynamic Sample Selection with Scaling Laws"

27 / 27 papers shown
Data Mixing Can Induce Phase Transitions in Knowledge Acquisition
Xinran Gu
Kaifeng Lyu
Jiazheng Li
Jingzhao Zhang
75
0
0
23 May 2025
OVERLORD: Ultimate Scaling of DataLoader for Multi-Source Large Foundation Model Training
Juntao Zhao
Qi Lu
Wei Jia
Borui Wan
Lei Zuo
...
Size Zheng
Yanghua Peng
H. Lin
Xin Liu
Chuan Wu
AI4CE
132
0
0
14 Apr 2025
Not-Just-Scaling Laws: Towards a Better Understanding of the Downstream Impact of Language Model Design Decisions
Emmy Liu
Amanda Bertsch
Lintang Sutawika
Lindia Tjuatja
Patrick Fernandes
...
Siyang Song
Carolin (Haas) Lawrence
Aditi Raghunathan
Kiril Gashteovski
Graham Neubig
260
3
0
05 Mar 2025
Mixtera: A Data Plane for Foundation Model Training
Maximilian Böther
Xiaozhe Yao
Tolga Kerimoglu
Viktor Gsteiger
Ana Klimovic
MoE
180
0
0
27 Feb 2025
MixMin: Finding Data Mixtures via Convex Minimization
Anvith Thudi
Evianne Rovers
Yangjun Ruan
Tristan Thrush
Chris J. Maddison
101
0
0
14 Feb 2025
Paint Outside the Box: Synthesizing and Selecting Training Data for Visual Grounding
Zilin Du
Haoxin Li
Jianfei Yu
Boyang Li
483
0
0
01 Dec 2024
Upsample or Upweight? Balanced Training on Heavily Imbalanced Datasets
Tianjian Li
Haoran Xu
Weiting Tan
Kenton Murray
Daniel Khashabi
116
1
0
06 Oct 2024
Scaling Laws for Data Filtering -- Data Curation cannot be Compute Agnostic
Sachin Goyal
Pratyush Maini
Zachary Chase Lipton
Aditi Raghunathan
J. Zico Kolter
100
46
0
10 Apr 2024
Data Mixing Laws: Optimizing Data Mixtures by Predicting Language Modeling Performance
Jiasheng Ye
Peiju Liu
Tianxiang Sun
Yunhua Zhou
Jun Zhan
Xipeng Qiu
121
76
0
25 Mar 2024
InfoBatch: Lossless Training Speed Up by Unbiased Dynamic Data Pruning
Ziheng Qin
Kaidi Wang
Zangwei Zheng
Jianyang Gu
Xiang Peng
...
Daquan Zhou
Lei Shang
Baigui Sun
Xuansong Xie
Yang You
183
53
0
08 Mar 2023
Beyond neural scaling laws: beating power law scaling via data pruning
Ben Sorscher
Robert Geirhos
Shashank Shekhar
Surya Ganguli
Ari S. Morcos
100
444
0
29 Jun 2022
Scaling Laws and Interpretability of Learning from Repeated Data
Danny Hernandez
Tom B. Brown
Tom Conerly
Nova Dassarma
Dawn Drain
...
Catherine Olsson
Dario Amodei
Nicholas Joseph
Jared Kaplan
Sam McCandlish
77
118
0
21 May 2022
GPT-NeoX-20B: An Open-Source Autoregressive Language Model
Sid Black
Stella Biderman
Eric Hallahan
Quentin G. Anthony
Leo Gao
...
Shivanshu Purohit
Laria Reynolds
J. Tow
Benqi Wang
Samuel Weinbach
184
836
0
14 Apr 2022
Training Compute-Optimal Large Language Models
Jordan Hoffmann
Sebastian Borgeaud
A. Mensch
Elena Buchatskaya
Trevor Cai
...
Karen Simonyan
Erich Elsen
Jack W. Rae
Oriol Vinyals
Laurent Sifre
AI4TS
211
1,987
0
29 Mar 2022
Deduplicating Training Data Makes Language Models Better
Katherine Lee
Daphne Ippolito
A. Nystrom
Chiyuan Zhang
Douglas Eck
Chris Callison-Burch
Nicholas Carlini
SyDa
362
637
0
14 Jul 2021
Deep Learning Through the Lens of Example Difficulty
R. Baldock
Hartmut Maennel
Behnam Neyshabur
87
161
0
17 Jun 2021
RoFormer: Enhanced Transformer with Rotary Position Embedding
Jianlin Su
Yu Lu
Shengfeng Pan
Ahmed Murtadha
Bo Wen
Yunfeng Liu
329
2,533
0
20 Apr 2021
Learning Curve Theory
Marcus Hutter
224
64
0
08 Feb 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
476
2,123
0
31 Dec 2020
The Cost of Training NLP Models: A Concise Overview
Or Sharir
Barak Peleg
Y. Shoham
101
214
0
19 Apr 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
651
4,925
0
23 Jan 2020
PIQA: Reasoning about Physical Commonsense in Natural Language
Yonatan Bisk
Rowan Zellers
Ronan Le Bras
Jianfeng Gao
Yejin Choi
OOD LRM
192
1,847
0
26 Nov 2019
Reparameterizable Subset Sampling via Continuous Relaxations
Sang Michael Xie
Stefano Ermon
BDL
79
99
0
29 Jan 2019
CASED: Curriculum Adaptive Sampling for Extreme Data Imbalance
Andrew Jesson
N. Guizard
Sina Hamidi Ghalehjegh
D. Goblot
F. Soudan
Nicolas Chapados
59
48
0
27 Jul 2018
Attention-Guided Curriculum Learning for Weakly Supervised Classification and Localization of Thoracic Diseases on Chest Radiographs
Yuxing Tang
Xiaosong Wang
Adam P. Harrison
Le Lu
Jing Xiao
Ronald M. Summers
79
154
0
19 Jul 2018
Automated Curriculum Learning for Neural Networks
Alex Graves
Marc G. Bellemare
Jacob Menick
Rémi Munos
Koray Kavukcuoglu
101
530
0
10 Apr 2017
Ranking via Sinkhorn Propagation
Ryan P. Adams
R. Zemel
110
148
0
09 Jun 2011