Pervasive Label Errors in Test Sets Destabilize Machine Learning Benchmarks

26 March 2021

Papers citing "Pervasive Label Errors in Test Sets Destabilize Machine Learning Benchmarks"

45 / 95 papers shown

The Bearable Lightness of Big Data: Towards Massive Public Datasets in Scientific Machine Learning Wai Tong Chung Kihoon Jung Jacqueline H. Chen M. Ihme AI4CE 24 3 0 25 Jul 2022
POP: Mining POtential Performance of new fashion products via webly cross-modal query expansion Christian Joppi Geri Skenderi Marco Cristani 18 3 0 22 Jul 2022
Labeling instructions matter in biomedical image analysis Tim Radsch Annika Reinke V. Weru M. Tizabi Nicholas Schreck A. Emre Kavur Bunyamin Pekdemir T. Ross A. Kopp-Schneider Lena Maier-Hein 25 53 0 20 Jul 2022
Beyond Hard Labels: Investigating data label distributions Vasco Grossmann Lars Schmarje Reinhard Koch 23 11 0 13 Jul 2022
Is one annotation enough? A data-centric image classification benchmark for noisy and ambiguous label estimation Lars Schmarje Vasco Grossmann Claudius Zelenka S. Dippel R. Kiko ... M. Pastell J. Stracke A. Valros N. Volkmann Reinhard Koch 45 34 0 13 Jul 2022
Eliciting and Learning with Soft Labels from Every Annotator K. M. Collins Umang Bhatt Adrian Weller 29 44 0 02 Jul 2022
Efficient Adversarial Training With Data Pruning Maximilian Kaufmann Yiren Zhao Ilia Shumailov Robert D. Mullins Nicolas Papernot AAML 42 7 0 01 Jul 2022
Distilling Model Failures as Directions in Latent Space Saachi Jain Hannah Lawrence Ankur Moitra A. Madry 23 90 0 29 Jun 2022
PARTICUL: Part Identification with Confidence measure using Unsupervised Learning Romain Xu-Darme Georges Quénot Zakaria Chihani M. Rousset 19 7 0 27 Jun 2022
Natural Backdoor Datasets Emily Wenger Roma Bhattacharjee A. Bhagoji Josephine Passananti Emilio Andere Haitao Zheng Ben Y. Zhao AAML 35 4 0 21 Jun 2022
The Privacy Onion Effect: Memorization is Relative Nicholas Carlini Matthew Jagielski Chiyuan Zhang Nicolas Papernot Andreas Terzis Florian Tramèr PILM MIACV 35 102 0 21 Jun 2022
Sparse Double Descent: Where Network Pruning Aggravates Overfitting Zhengqi He Zeke Xie Quanzhi Zhu Zengchang Qin 81 27 0 17 Jun 2022
Differentiable Top-k Classification Learning Felix Petersen Hilde Kuehne Christian Borgelt Oliver Deussen 61 28 0 15 Jun 2022
Annotation Error Detection: Analyzing the Past and Present for a More Coherent Future Jan-Christoph Klie Bonnie Webber Iryna Gurevych 42 43 0 05 Jun 2022
Hide and Seek: on the Stealthiness of Attacks against Deep Learning Systems Zeyan Liu Fengjun Li Jingqiang Lin Zhu Li Bo Luo AAML 15 1 0 31 May 2022
CyCLIP: Cyclic Contrastive Language-Image Pretraining Shashank Goel Hritik Bansal S. Bhatia Ryan A. Rossi Vishwa Vinay Aditya Grover CLIP VLM 184 134 0 28 May 2022
Detecting Label Errors by using Pre-Trained Language Models Derek Chong Jenny Hong Christopher D. Manning NoLa 55 21 0 25 May 2022
DendroMap: Visual Exploration of Large-Scale Image Datasets for Machine Learning with Treemaps Donald Bertucci M. Hamid Yashwanthi Anand Anita Ruangrotsakun Delyar Tabatabai Melissa Perez Minsuk Kahng 43 29 0 14 May 2022
When does dough become a bagel? Analyzing the remaining mistakes on ImageNet Vijay Vasudevan Benjamin Caine Raphael Gontijo-Lopes Sara Fridovich-Keil Rebecca Roelofs VLM UQCV 48 57 0 09 May 2022
HumanAL: Calibrating Human Matching Beyond a Single Task Roee Shraga HAI 19 6 0 06 May 2022
Few-shot Learning with Noisy Labels Kevin J Liang Samrudhdhi B. Rangrej Vladan Petrovic Tal Hassner NoLa 30 47 0 12 Apr 2022
A Siren Song of Open Source Reproducibility Edward Raff Andrew L. Farris 16 9 0 09 Apr 2022
Towards Responsible Natural Language Annotation for the Varieties of Arabic A. S. Bergman Mona T. Diab 27 18 0 17 Mar 2022
Deconstructing Distributions: A Pointwise Framework of Learning Gal Kaplun Nikhil Ghosh Saurabh Garg Boaz Barak Preetum Nakkiran OOD 33 21 0 20 Feb 2022
Beyond Images: Label Noise Transition Matrix Estimation for Tasks with Lower-Quality Features Zhaowei Zhu Jialu Wang Yang Liu NoLa 38 37 0 02 Feb 2022
Towards Adversarial Evaluations for Inexact Machine Unlearning Shashwat Goel Ameya Prabhu Amartya Sanyal Ser-Nam Lim Philip Torr Ponnurangam Kumaraguru AAML ELM MU 46 50 0 17 Jan 2022
Ground-Truth, Whose Truth? -- Examining the Challenges with Annotating Toxic Text Datasets Kofi Arhin Ioana Baldini Dennis L. Wei Karthikeyan N. Ramamurthy Moninder Singh 20 19 0 07 Dec 2021
Towards the One Learning Algorithm Hypothesis: A System-theoretic Approach Christos N. Mavridis John S. Baras 26 1 0 04 Dec 2021
MOTIF: A Large Malware Reference Dataset with Ground Truth Family Labels R. Joyce Dev Amlani B. Hamilton Edward Raff 44 21 0 29 Nov 2021
Smart Data Representations: Impact on the Accuracy of Deep Neural Networks Oliver Neumann Nicole Ludwig Marian Turowski Benedikt Heidrich V. Hagenmeyer Ralf Mikut AI4TS 42 1 0 17 Nov 2021
Beta Shapley: a Unified and Noise-reduced Data Valuation Framework for Machine Learning Yongchan Kwon James Zou TDI 39 122 0 26 Oct 2021
Domain Adaptation on Semantic Segmentation with Separate Affine Transformation in Batch Normalization Junhao Yan Woonsok Lee OOD 16 1 0 14 Oct 2021
Detecting Corrupted Labels Without Training a Model to Predict Zhaowei Zhu Zihao Dong Yang Liu NoLa 149 62 0 12 Oct 2021
Sample Noise Impact on Active Learning A. Abraham L. Dreyfus-Schmidt 24 3 0 03 Sep 2021
The Benchmark Lottery Mostafa Dehghani Yi Tay A. Gritsenko Zhe Zhao N. Houlsby Fernando Diaz Donald Metzler Oriol Vinyals 42 89 0 14 Jul 2021
Demystifying the Draft EU Artificial Intelligence Act Michael Veale Frederik J. Zuiderveen Borgesius 35 335 0 08 Jul 2021
The Feasibility and Inevitability of Stealth Attacks I. Tyukin D. Higham Alexander Bastounis Eliyas Woldegeorgis Alexander N. Gorban AAML 22 19 0 26 Jun 2021
VidHarm: A Clip Based Dataset for Harmful Content Detection Johan Edstedt Amanda Berg M. Felsberg Johan Karlsson Francisca Benavente Anette Novak G. Pihlgren 28 2 0 15 Jun 2021
Partial success in closing the gap between human and machine vision Robert Geirhos Kantharaju Narayanappa Benjamin Mitzkus Tizian Thieringer Matthias Bethge Felix Wichmann Wieland Brendel VLM AAML 48 221 0 14 Jun 2021
Addressing "Documentation Debt" in Machine Learning Research: A Retrospective Datasheet for BookCorpus Jack Bandy Nicholas Vincent 29 57 0 11 May 2021
Evaluating Deep Neural Networks Trained on Clinical Images in Dermatology with the Fitzpatrick 17k Dataset Matthew Groh Caleb Harris L. Soenksen Felix Lau Rachel Han Aerin Kim A. Koochek Omar Badri 112 184 0 20 Apr 2021
On Training Sketch Recognizers for New Domains Kemal Tugrul Yesilbek T. M. Sezgin 33 3 0 18 Apr 2021
Online Deterministic Annealing for Classification and Clustering Christos N. Mavridis John S. Baras ODL 22 17 0 11 Feb 2021
Local Label Point Correction for Edge Detection of Overlapping Cervical Cells Jiawei Liu Huijie Fan Qiang Wang Wentao Li Yandong Tang Danbo Wang Mingyi Zhou Li Chen 13 9 0 05 Oct 2020
Confident Learning: Estimating Uncertainty in Dataset Labels Curtis G. Northcutt Lu Jiang Isaac L. Chuang NoLa 43 674 0 31 Oct 2019