Metadata Archaeology: Unearthing Data Subsets by Leveraging Training Dynamics

20 September 2022
Shoaib Ahmed Siddiqui
Nitarshan Rajkumar
Tegan Maharaj
David M. Krueger
Sara Hooker

Papers citing "Metadata Archaeology: Unearthing Data Subsets by Leveraging Training Dynamics"

29 / 29 papers shown
Beyond neural scaling laws: beating power law scaling via data pruning
Ben Sorscher
Robert Geirhos
Shashank Shekhar
Surya Ganguli
Ari S. Morcos
85
439
0
29 Jun 2022
Emergent Abilities of Large Language Models
Jason W. Wei
Yi Tay
Rishi Bommasani
Colin Raffel
Barret Zoph
...
Tatsunori Hashimoto
Oriol Vinyals
Percy Liang
J. Dean
W. Fedus
ELM
ReLM
LRM
263
2,462
0
15 Jun 2022
Selective Classification Via Neural Network Training Dynamics
Stephan Rabanser
Anvith Thudi
Kimia Hamidieh
Adam Dziedzic
Nicolas Papernot
66
22
0
26 May 2022
Evaluating Distributional Distortion in Neural Language Modeling
Benjamin LeBrun
Alessandro Sordoni
Timothy J. O'Donnell
43
22
0
24 Mar 2022
Quantifying Memorization Across Neural Language Models
Nicholas Carlini
Daphne Ippolito
Matthew Jagielski
Katherine Lee
Florian Tramèr
Chiyuan Zhang
PILM
100
614
0
15 Feb 2022
Just Train Twice: Improving Group Robustness without Training Group Information
Evan Zheran Liu
Behzad Haghgoo
Annie S. Chen
Aditi Raghunathan
Pang Wei Koh
Shiori Sagawa
Percy Liang
Chelsea Finn
OOD
84
559
0
19 Jul 2021
Deep Learning on a Data Diet: Finding Important Examples Early in Training
Mansheej Paul
Surya Ganguli
Gintare Karolina Dziugaite
105
456
0
15 Jul 2021
Deep Learning Through the Lens of Example Difficulty
R. Baldock
Hartmut Maennel
Behnam Neyshabur
72
160
0
17 Jun 2021
Algorithmic Bias and Data Bias: Understanding the Relation between Distributionally Robust Optimization and Data Curation
Agnieszka Słowik
Léon Bottou
FaML
60
19
0
17 Jun 2021
Beyond Fair Pay: Ethical Implications of NLP Crowdsourcing
Boaz Shmueli
Jan Fell
Soumya Ray
Lun-Wei Ku
149
87
0
20 Apr 2021
Learning Light-Weight Translation Models from Deep Transformer
Bei Li
Ziyang Wang
Hui Liu
Quan Du
Tong Xiao
Chunliang Zhang
Jingbo Zhu
VLM
139
40
0
27 Dec 2020
Environment Inference for Invariant Learning
Elliot Creager
J. Jacobsen
R. Zemel
OOD
57
382
0
14 Oct 2020
What Neural Networks Memorize and Why: Discovering the Long Tail via Influence Estimation
Vitaly Feldman
Chiyuan Zhang
TDI
134
462
0
09 Aug 2020
An Investigation of Why Overparameterization Exacerbates Spurious Correlations
Shiori Sagawa
Aditi Raghunathan
Pang Wei Koh
Percy Liang
182
379
0
09 May 2020
DivideMix: Learning with Noisy Labels as Semi-supervised Learning
Junnan Li
R. Socher
Steven C. H. Hoi
NoLa
94
1,026
0
18 Feb 2020
How Much Knowledge Can You Pack Into the Parameters of a Language Model?
Adam Roberts
Colin Raffel
Noam M. Shazeer
KELM
104
889
0
10 Feb 2020
Characterizing Structural Regularities of Labeled Data in Overparameterized Models
Ziheng Jiang
Chiyuan Zhang
Kunal Talwar
Michael C. Mozer
TDI
56
102
0
08 Feb 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
532
4,773
0
23 Jan 2020
Scaling Out-of-Distribution Detection for Real-World Settings
Dan Hendrycks
Steven Basart
Mantas Mazeika
Andy Zou
Joe Kwon
Mohammadreza Mostajabi
Jacob Steinhardt
D. Song
OODD
150
465
0
25 Nov 2019
Distributionally Robust Neural Networks for Group Shifts: On the Importance of Regularization for Worst-Case Generalization
Shiori Sagawa
Pang Wei Koh
Tatsunori B. Hashimoto
Percy Liang
OOD
85
1,236
0
20 Nov 2019
Accelerating Deep Learning by Focusing on the Biggest Losers
Angela H. Jiang
Daniel L.-K. Wong
Giulio Zhou
D. Andersen
J. Dean
...
Gauri Joshi
M. Kaminsky
M. Kozuch
Zachary Chase Lipton
Padmanabhan Pillai
54
121
0
02 Oct 2019
Hidden Stratification Causes Clinically Meaningful Failures in Machine Learning for Medical Imaging
Luke Oakden-Rayner
Jared A. Dunnmon
G. Carneiro
Christopher Ré
OOD
67
379
0
27 Sep 2019
Combating Label Noise in Deep Learning Using Abstention
S. Thulasidasan
Tanmoy Bhattacharya
J. Bilmes
Gopinath Chennupati
J. Mohd-Yusof
NoLa
52
179
0
27 May 2019
Unsupervised Label Noise Modeling and Loss Correction
Eric Arazo Sanchez
Diego Ortego
Paul Albert
Noel E. O'Connor
Kevin McGuinness
NoLa
74
610
0
25 Apr 2019
Probabilistic End-to-end Noise Correction for Learning with Noisy Labels
Kun Yi
Jianxin Wu
NoLa
66
414
0
19 Mar 2019
Deep Learning Predicts Hip Fracture using Confounding Patient and Healthcare Variables
Marcus A. Badgeley
J. Zech
Luke Oakden-Rayner
Benjamin S. Glicksberg
Manway Liu
William Gale
M. McConnell
Beth Percha
Thomas M. Snyder
Joel T. Dudley
AI4CE
OOD
74
244
0
08 Nov 2018
mixup: Beyond Empirical Risk Minimization
Hongyi Zhang
Moustapha Cissé
Yann N. Dauphin
David Lopez-Paz
NoLa
271
9,743
0
25 Oct 2017
A Closer Look at Memorization in Deep Networks
Devansh Arpit
Stanislaw Jastrzebski
Nicolas Ballas
David M. Krueger
Emmanuel Bengio
...
Tegan Maharaj
Asja Fischer
Aaron Courville
Yoshua Bengio
Simon Lacoste-Julien
TDI
120
1,814
0
16 Jun 2017
Understanding Black-box Predictions via Influence Functions
Pang Wei Koh
Percy Liang
TDI
169
2,878
0
14 Mar 2017