2206.07137
Cited By
Prioritized Training on Points that are Learnable, Worth Learning, and Not Yet Learnt
14 June 2022
Sören Mindermann
J. Brauner
Muhammed Razzak
Mrinank Sharma
Andreas Kirsch
Winnie Xu
Benedikt Höltgen
Aidan N. Gomez
Adrien Morisot
Sebastian Farquhar
Y. Gal
Papers citing
"Prioritized Training on Points that are Learnable, Worth Learning, and Not Yet Learnt"
50 / 112 papers shown
Title
How to Train Data-Efficient LLMs
Noveen Sachdeva
Benjamin Coleman
Wang-Cheng Kang
Jianmo Ni
Lichan Hong
Ed H. Chi
James Caverlee
Julian McAuley
D. Cheng
29
51
0
15 Feb 2024
LESS: Selecting Influential Data for Targeted Instruction Tuning
Mengzhou Xia
Sadhika Malladi
Suchin Gururangan
Sanjeev Arora
Danqi Chen
80
186
0
06 Feb 2024
DsDm: Model-Aware Dataset Selection with Datamodels
Logan Engstrom
Axel Feldmann
A. Madry
OODD
15
47
0
23 Jan 2024
Generative Deduplication For Social Media Data Selection
Xianming Li
Jing Li
29
2
0
11 Jan 2024
On the Convergence of Loss and Uncertainty-based Active Learning Algorithms
Daniel Haimovich
Dima Karamshuk
Fridolin Linder
Niek Tax
Milan Vojnovic
21
0
0
21 Dec 2023
Mitigating Label Bias in Machine Learning: Fairness through Confident Learning
Yixuan Zhang
Boyu Li
Zenan Ling
Feng Zhou
FaML
13
3
0
14 Dec 2023
Bad Students Make Great Teachers: Active Learning Accelerates Large-Scale Visual Understanding
Talfan Evans
Shreya Pathak
Hamza Merzic
Jonathan Schwarz
Ryutaro Tanno
Olivier J. Hénaff
18
16
0
08 Dec 2023
REDUCR: Robust Data Downsampling Using Class Priority Reweighting
William Bankes
George Hughes
Ilija Bogunovic
Zi Wang
28
3
0
01 Dec 2023
Computing Approximate $\ell_p$ Sensitivities
Swati Padmanabhan
David P. Woodruff
Qiuyi Zhang
45
0
0
07 Nov 2023
AdaFlood: Adaptive Flood Regularization
Wonho Bae
Yi Ren
Mohamad Osama Ahmed
Frederick Tung
Danica J. Sutherland
Gabriel L. Oliveira
AI4CE
39
1
0
06 Nov 2023
Self-Influence Guided Data Reweighting for Language Model Pre-training
Megh Thakkar
Tolga Bolukbasi
Sriram Ganapathy
Shikhar Vashishth
Sarath Chandar
Partha P. Talukdar
MILM
32
20
0
02 Nov 2023
Data Optimization in Deep Learning: A Survey
Ou Wu
Rujing Yao
35
1
0
25 Oct 2023
KAKURENBO: Adaptively Hiding Samples in Deep Neural Network Training
Truong Thao Nguyen
Balazs Gerofi
Edgar Josafat Martinez-Noriega
François Trahay
M. Wahib
29
1
0
16 Oct 2023
Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning
Mengzhou Xia
Tianyu Gao
Zhiyuan Zeng
Danqi Chen
40
267
0
10 Oct 2023
What do larger image classifiers memorise?
Michal Lukasik
Vaishnavh Nagarajan
A. S. Rawat
A. Menon
Sanjiv Kumar
30
5
0
09 Oct 2023
GRASP: A Rehearsal Policy for Efficient Online Continual Learning
Md Yousuf Harun
Jhair Gallardo
Junyu Chen
Christopher Kanan
CLL
36
9
0
25 Aug 2023
D4: Improving LLM Pretraining via Document De-Duplication and Diversification
Kushal Tirumala
Daniel Simig
Armen Aghajanyan
Ari S. Morcos
SyDa
13
104
0
23 Aug 2023
From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning
Ming Li
Yong Zhang
Zhitao Li
Jiuhai Chen
Lichang Chen
Ning Cheng
Jianzong Wang
Dinesh Manocha
Jing Xiao
38
170
0
23 Aug 2023
Towards Accelerated Model Training via Bayesian Data Selection
Zhijie Deng
Peng Cui
Jun Zhu
16
4
0
21 Aug 2023
No Train No Gain: Revisiting Efficient Training Algorithms For Transformer-based Language Models
Jean Kaddour
Oscar Key
Piotr Nawrot
Pasquale Minervini
Matt J. Kusner
20
41
0
12 Jul 2023
T-MARS: Improving Visual Representations by Circumventing Text Feature Learning
Pratyush Maini
Sachin Goyal
Zachary Chase Lipton
J. Zico Kolter
Aditi Raghunathan
VLM
42
33
0
06 Jul 2023
Exploring Data Redundancy in Real-world Image Classification through Data Selection
Zhenyu Tang
Shaoting Zhang
Xiaosong Wang
22
2
0
25 Jun 2023
AdaSelection: Accelerating Deep Learning Training through Data Subsampling
Minghe Zhang
Chaosheng Dong
Jinmiao Fu
Tianchen Zhou
Jia Liang
...
Bo Liu
Michinari Momma
Bryan Wang
Yan Gao
Yi Sun
30
3
0
19 Jun 2023
Task-specific experimental design for treatment effect estimation
Beth D. Connolly
Kim Moore
Tobias Schwedes
Alexander Adam
Gary Willis
Ilya Feige
Christopher Frye
CML
19
3
0
08 Jun 2023
NLU on Data Diets: Dynamic Data Subset Selection for NLP Classification Tasks
Jean-Michel Attendu
Jean-Philippe Corbeil
33
15
0
05 Jun 2023
Towards Sustainable Learning: Coresets for Data-efficient Deep Learning
Yu Yang
Hao Kang
Baharan Mirzasoleiman
36
34
0
02 Jun 2023
Repeated Random Sampling for Minimizing the Time-to-Accuracy of Learning
Patrik Okanovic
R. Waleffe
Vasilis Mageirakos
Konstantinos E. Nikolakakis
Amin Karbasi
Dionysis Kalogerias
Nezihe Merve Gürel
Theodoros Rekatsinas
DD
45
12
0
28 May 2023
In-Context Demonstration Selection with Cross Entropy Difference
Dan Iter
Reid Pryzant
Ruochen Xu
Shuohang Wang
Yang Liu
Yichong Xu
Chenguang Zhu
23
11
0
24 May 2023
Selective Pre-training for Private Fine-tuning
Da Yu
Sivakanth Gopi
Janardhan Kulkarni
Zinan Lin
Saurabh Naik
Tomasz Religa
Jian Yin
Huishuai Zhang
35
19
0
23 May 2023
DoReMi: Optimizing Data Mixtures Speeds Up Language Model Pretraining
Sang Michael Xie
Hieu H. Pham
Xuanyi Dong
Nan Du
Hanxiao Liu
Yifeng Lu
Percy Liang
Quoc V. Le
Tengyu Ma
Adams Wei Yu
MoMe
MoE
47
177
0
17 May 2023
Alternating Gradient Descent and Mixture-of-Experts for Integrated Multimodal Perception
Hassan Akbari
Dan Kondratyuk
Huayu Chen
Rachel Hornung
Haoran Wang
Hartwig Adam
VLM
MoE
30
11
0
10 May 2023
Learning Sample Difficulty from Pre-trained Models for Reliable Prediction
Peng Cui
Dan Zhang
Zhijie Deng
Yinpeng Dong
Junyi Zhu
21
12
0
20 Apr 2023
On Efficient Training of Large-Scale Deep Learning Models: A Literature Review
Li Shen
Yan Sun
Zhiyuan Yu
Liang Ding
Xinmei Tian
Dacheng Tao
VLM
30
41
0
07 Apr 2023
InfoBatch: Lossless Training Speed Up by Unbiased Dynamic Data Pruning
Ziheng Qin
Kaidi Wang
Zangwei Zheng
Jianyang Gu
Xiang Peng
...
Daquan Zhou
Lei Shang
Baigui Sun
Xuansong Xie
Yang You
127
47
0
08 Mar 2023
Curriculum Based Multi-Task Learning for Parkinson's Disease Detection
Nikhil J. Dhinagar
Conor Owens-Walton
Emily Laltoo
C. Boyle
Yao-Liang Chen
...
Chih-Chien Tsai
Jiun-Jie Wang
Yih-Ru Wu
Y. D. Werf
Paul M. Thompson
10
3
0
27 Feb 2023
Data-Efficient Contrastive Self-supervised Learning: Most Beneficial Examples for Supervised Learning Contribute the Least
S. Joshi
Baharan Mirzasoleiman
SSL
27
19
0
18 Feb 2023
Confidence-based Reliable Learning under Dual Noises
Peng Cui
Yang Yue
Zhijie Deng
Jun Zhu
NoLa
33
8
0
10 Feb 2023
Selective In-Context Data Augmentation for Intent Detection using Pointwise V-Information
Yen-Ting Lin
Alexandros Papangelis
Seokhwan Kim
Sungjin Lee
Devamanyu Hazarika
Mahdi Namazifar
Di Jin
Yang Liu
Dilek Z. Hakkani-Tür
26
35
0
10 Feb 2023
Data Selection for Language Models via Importance Resampling
Sang Michael Xie
Shibani Santurkar
Tengyu Ma
Percy Liang
35
170
0
06 Feb 2023
Cramming: Training a Language Model on a Single GPU in One Day
Jonas Geiping
Tom Goldstein
MoE
30
84
0
28 Dec 2022
Selective classification using a robust meta-learning approach
Nishant Jain
Karthikeyan Shanmugam
Pradeep Shenoy
OOD
26
2
0
12 Dec 2022
Instance-Conditional Timescales of Decay for Non-Stationary Learning
Nishant Jain
Pradeep Shenoy
27
3
0
12 Dec 2022
General Intelligence Requires Rethinking Exploration
Minqi Jiang
Tim Rocktäschel
Edward Grefenstette
LRM
29
17
0
15 Nov 2022
Reduce, Reuse, Recycle: Improving Training Efficiency with Distillation
Cody Blakeney
Jessica Zosa Forde
Jonathan Frankle
Ziliang Zong
Matthew L. Leavitt
VLM
24
4
0
01 Nov 2022
Compute-Efficient Deep Learning: Algorithmic Trends and Opportunities
Brian Bartoldson
B. Kailkhura
Davis W. Blalock
31
47
0
13 Oct 2022
Robust Active Distillation
Cenk Baykal
Khoa Trinh
Fotis Iliopoulos
Gaurav Menghani
Erik Vee
31
10
0
03 Oct 2022
Stop Wasting My Time! Saving Days of ImageNet and BERT Training with Latest Weight Averaging
Jean Kaddour
MoMe
3DH
24
39
0
29 Sep 2022
Metadata Archaeology: Unearthing Data Subsets by Leveraging Training Dynamics
Shoaib Ahmed Siddiqui
Nitarshan Rajkumar
Tegan Maharaj
David M. Krueger
Sara Hooker
42
27
0
20 Sep 2022
Prioritizing Samples in Reinforcement Learning with Reducible Loss
Shivakanth Sujit
Somjit Nath
Pedro H. M. Braga
Samira Ebrahimi Kahou
42
15
0
22 Aug 2022
Machine Learning with Confidential Computing: A Systematization of Knowledge
Fan Mo
Zahra Tarkhani
Hamed Haddadi
40
8
0
22 Aug 2022