2103.14749
Cited By
Pervasive Label Errors in Test Sets Destabilize Machine Learning Benchmarks
26 March 2021
Curtis G. Northcutt
Anish Athalye
Jonas W. Mueller
Papers citing
"Pervasive Label Errors in Test Sets Destabilize Machine Learning Benchmarks"
50 / 92 papers shown
The Pitfalls of Benchmarking in Algorithm Selection: What We Are Getting Wrong
G. Petelin
Gjorgjina Cenikj
34
0
0
12 May 2025
Adversarial Robustness of Deep Learning Models for Inland Water Body Segmentation from SAR Images
Siddharth Kothari
Srinivasan Murali
Sankalp Kothari
Ujjwal Verma
Jaya Sreevalsan-Nair
57
0
0
03 May 2025
When Dynamic Data Selection Meets Data Augmentation
Steve Yang
Peng Ye
Furao Shen
Dongzhan Zhou
42
0
0
02 May 2025
Hide and Seek in Noise Labels: Noise-Robust Collaborative Active Learning with LLM-Powered Assistance
Bo Yuan
Yulin Chen
Yin Zhang
Wei Jiang
NoLa
40
6
0
03 Apr 2025
Reading the unreadable: Creating a dataset of 19th century English newspapers using image-to-text language models
Jonathan Bourne
77
0
0
24 Feb 2025
DEUCE: Dual-diversity Enhancement and Uncertainty-awareness for Cold-start Active Learning
Jiaxin Guo
Cheng Chen
Shuzhen Li
Tianze Zhang
63
0
0
01 Feb 2025
CSA: Data-efficient Mapping of Unimodal Features to Multimodal Features
Po-han Li
Sandeep Chinchali
Ufuk Topcu
36
1
0
10 Oct 2024
Label Convergence: Defining an Upper Performance Bound in Object Recognition through Contradictory Annotations
David Tschirschwitz
Volker Rodehorst
31
1
0
14 Sep 2024
Assistive Image Annotation Systems with Deep Learning and Natural Language Capabilities: A Review
Moseli Motsóehli
VLM
3DV
37
0
0
28 Jun 2024
Data Valuation by Leveraging Global and Local Statistical Information
Xiaoling Zhou
Ou Wu
Michael K. Ng
Hao Jiang
TDI
30
0
0
23 May 2024
Are large language models superhuman chemists?
Adrian Mirza
Nawaf Alampara
Sreekanth Kunchapu
Benedict Emoekabu
Aswanth Krishnan
...
Leanne M. Stafast
Dinga Wonanke
Michael Pieler
P. Schwaller
Kevin Maik Jablonka
ELM
AI4MH
LRM
LM&MA
31
5
0
01 Apr 2024
Better than classical? The subtle art of benchmarking quantum machine learning models
Joseph Bowles
Shahnawaz Ahmed
Maria Schuld
42
65
0
11 Mar 2024
Corrective Machine Unlearning
Shashwat Goel
Ameya Prabhu
Philip Torr
Ponnurangam Kumaraguru
Amartya Sanyal
OnRL
42
14
0
21 Feb 2024
Leveraging Human-Machine Interactions for Computer Vision Dataset Quality Enhancement
Esla Timothy Anzaku
Hyesoo Hong
Jin-Woo Park
Wonjun Yang
Kangmin Kim
Jongbum Won
Deshika Vinoshani Kumari Herath
Arnout Van Messem
W. D. Neve
20
0
0
31 Jan 2024
Investigating the Quality of DermaMNIST and Fitzpatrick17k Dermatological Image Datasets
Kumar Abhishek
Aditi Jain
Ghassan Hamarneh
49
3
0
25 Jan 2024
Towards Reliable Dermatology Evaluation Benchmarks
Fabian Gröger
Simone Lionetti
Philippe Gottfrois
Alvaro Gonzalez-Jimenez
Matthew Groh
Roxana Daneshjou
Labelling Consortium
Alexander A. Navarini
Marc Pouly
33
5
0
13 Sep 2023
Adaptive conformal classification with noisy labels
Matteo Sesia
Y. X. R. Wang
Xin Tong
24
13
0
10 Sep 2023
FPR Estimation for Fraud Detection in the Presence of Class-Conditional Label Noise
Justin Tittelfitz
31
0
0
04 Aug 2023
From Attachments to SEO: Click Here to Learn More about Clickbait PDFs!
Giada Stivala
Sahar Abdelnabi
Andrea Mengascini
Mariano Graziano
Mario Fritz
Giancarlo Pellegrino
24
1
0
02 Aug 2023
LUCID-GAN: Conditional Generative Models to Locate Unfairness
Andres Algaba
Carmen Mazijn
Carina E. A. Prunkl
J. Danckaert
Vincent Ginis
SyDa
42
1
0
28 Jul 2023
On Evaluation of Document Classification using RVL-CDIP
Stefan Larson
Gordon Lim
Kevin Leach
39
3
0
21 Jun 2023
Quantifying lottery tickets under label noise: accuracy, calibration, and complexity
V. Arora
Daniele Irto
Sebastian Goldt
G. Sanguinetti
38
2
0
21 Jun 2023
Rapid Image Labeling via Neuro-Symbolic Learning
Yifeng Wang
Zhi Tu
Yiwen Xiang
Shiyuan Zhou
Xiyuan Chen
Bingxuan Li
Tianyi Zhang
VLM
37
6
0
18 Jun 2023
AI-Supported Assessment of Load Safety
Julius Schöning
Niklas Kruse
22
0
0
06 Jun 2023
MultiTurnCleanup: A Benchmark for Multi-Turn Spoken Conversational Transcript Cleanup
Hua Shen
Vicky Zayats
Johann C. Rocholl
D. D. Walker
Dirk Padfield
47
3
0
19 May 2023
NoisywikiHow: A Benchmark for Learning with Real-world Noisy Labels in Natural Language Processing
Tingting Wu
Xiao Ding
Minji Tang
Haotian Zhang
Bing Qin
Ting Liu
NoLa
34
10
0
18 May 2023
Fairness and Bias in Truth Discovery Algorithms: An Experimental Analysis
Simone Lazier
Saravanan Thirumuruganathan
Hadis Anahideh
29
3
0
25 Apr 2023
Improved Naive Bayes with Mislabeled Data
Qianhan Zeng
Yingqiu Zhu
Xuening Zhu
Feifei Wang
Weichen Zhao
Shuning Sun
Meng Su
Hansheng Wang
NoLa
13
2
0
13 Apr 2023
Evaluation of Confidence-based Ensembling in Deep Learning Image Classification
Rafael Rosales
Peter Popov
Michael Paulitsch
UQCV
8
2
0
03 Mar 2023
Towards Unbounded Machine Unlearning
M. Kurmanji
Peter Triantafillou
Jamie Hayes
Eleni Triantafillou
MU
28
123
0
20 Feb 2023
ActiveLab: Active Learning with Re-Labeling by Multiple Annotators
Hui Wen Goh
Jonas W. Mueller
29
3
0
27 Jan 2023
Look Beyond Bias with Entropic Adversarial Data Augmentation
Thomas Duboudin
Emmanuel Dellandrea
Corentin Abgrall
Gilles Hénaff
Liming Chen
CML
35
4
0
10 Jan 2023
Learning from Training Dynamics: Identifying Mislabeled Data Beyond Manually Designed Features
Qingrui Jia
Xuhong Li
Lei Yu
Jiang Bian
Penghao Zhao
Shupeng Li
Haoyi Xiong
Dejing Dou
NoLa
35
5
0
19 Dec 2022
Convergence Analysis for Training Stochastic Neural Networks via Stochastic Gradient Descent
Richard Archibald
F. Bao
Yanzhao Cao
Hui‐Jie Sun
52
2
0
17 Dec 2022
Azimuth: Systematic Error Analysis for Text Classification
Gabrielle Gauthier Melançon
Orlando Marquez Ayala
Lindsay D. Brin
Chris Tyler
Frederic Branchaud-Charron
Joseph Marinier
Karine Grande
Dieu-Thu Le
16
3
0
16 Dec 2022
Measuring Annotator Agreement Generally across Complex Structured, Multi-object, and Free-text Annotation Tasks
Alexander Braylan
Omar Alonso
Matthew Lease
8
17
0
15 Dec 2022
The Grind for Good Data: Understanding ML Practitioners' Struggles and Aspirations in Making Good Data
Inha Cha
Juhyun Oh
Cheul Young Park
Jiyoon Han
Hwalsuk Lee
29
2
0
28 Nov 2022
Combating noisy labels in object detection datasets
K. Chachula
Jakub Lyskawa
Bartlomiej Olber
Piotr Fratczak
A. Popowicz
Krystian Radlak
NoLa
31
4
0
25 Nov 2022
Identifying Incorrect Annotations in Multi-Label Classification Data
Aditya Thyagarajan
Elías Snorrason
Curtis G. Northcutt
Jonas W. Mueller
37
10
0
25 Nov 2022
Quantifying the Impact of Label Noise on Federated Learning
Shuqi Ke
Chao Huang
Xin Liu
FedML
28
7
0
15 Nov 2022
DC-Check: A Data-Centric AI checklist to guide the development of reliable machine learning systems
Nabeel Seedat
F. Imrie
M. Schaar
27
12
0
09 Nov 2022
Seeing the Unseen: Errors and Bias in Visual Datasets
Hongrui Jin
29
0
0
03 Nov 2022
Unsupervised visualization of image datasets using contrastive learning
Jan Boehm
Philipp Berens
D. Kobak
SSL
26
15
0
18 Oct 2022
CROWDLAB: Supervised learning to infer consensus labels and quality scores for data with multiple annotators
Hui Wen Goh
Ulyana Tkachenko
Jonas W. Mueller
19
10
0
13 Oct 2022
Detecting Label Errors in Token Classification Data
Wei-Chen Wang
Jonas W. Mueller
27
13
0
08 Oct 2022
Annealing Optimization for Progressive Learning with Stochastic Approximation
Christos N. Mavridis
John S. Baras
28
10
0
06 Sep 2022
Efficient Methods for Natural Language Processing: A Survey
Marcos Vinícius Treviso
Ji-Ung Lee
Tianchu Ji
Betty van Aken
Qingqing Cao
...
Emma Strubell
Niranjan Balasubramanian
Leon Derczynski
Iryna Gurevych
Roy Schwartz
33
109
0
31 Aug 2022
Bugs in the Data: How ImageNet Misrepresents Biodiversity
A. Luccioni
David Rolnick
21
43
0
24 Aug 2022
The Bearable Lightness of Big Data: Towards Massive Public Datasets in Scientific Machine Learning
Wai Tong Chung
Kihoon Jung
Jacqueline H. Chen
M. Ihme
AI4CE
24
3
0
25 Jul 2022
POP: Mining POtential Performance of new fashion products via webly cross-modal query expansion
Christian Joppi
Geri Skenderi
Marco Cristani
18
3
0
22 Jul 2022