"Garbage In, Garbage Out" Revisited: What Do Machine Learning Application Papers Report About Human-Labeled Training Data?

5 July 2021

Papers citing "'Garbage In, Garbage Out' Revisited: What Do Machine Learning Application Papers Report About Human-Labeled Training Data?"

Title | Authors | Tags | Metrics | Date
A quest through interconnected datasets: lessons from highly-cited ICASSP papers | Cynthia C. S. Liem, Doğa Taşcılar, Andrew M. Demetriou | - | 30 0 0 | 19 Sep 2024
Situated Ground Truths: Enhancing Bias-Aware AI by Situating Data Labels with SituAnnotate | Delfina Sol Martinez Pandiani, Valentina Presutti | - | 24 1 0 | 10 Jun 2024
The Unseen Targets of Hate -- A Systematic Review of Hateful Communication Datasets | Zehui Yu, Indira Sen, Dennis Assenmacher, Mattia Samory, Leon Fröhling, Christina Dahn, Debora Nozza, Claudia Wagner | - | 35 5 0 | 14 May 2024
Best practices for machine learning in antibody discovery and development | Leonard Wossnig, Norbert Furtmann, Andrew Buchanan, Sandeep Kumar, Victor Greiff | - | 20 7 0 | 13 Dec 2023
Toxic language detection: a systematic review of Arabic datasets | Imene Bensalem, Paolo Rosso, Hanane Zitouni | - | 32 4 0 | 12 Dec 2023
Reproducibility in Multiple Instance Learning: A Case For Algorithmic Unit Tests | Edward Raff, James Holt | - | 27 3 0 | 27 Oct 2023
Envisioning Narrative Intelligence: A Creative Visual Storytelling Anthology | Brett A. Halperin, S. Lukin | CoGe | 68 24 0 | 06 Oct 2023
Ground Truth Or Dare: Factors Affecting The Creation Of Medical Datasets For Training AI | H. D. Zając, Natalia-Rozalia Avlona, T. O. Andersen, F. Kensing, Irina Shklovski | - | 27 17 0 | 12 Aug 2023
Personalization of Stress Mobile Sensing using Self-Supervised Learning | Tanvir Islam, Peter Washington | - | 24 6 0 | 04 Aug 2023
Closing the Loop: Testing ChatGPT to Generate Model Explanations to Improve Human Labelling of Sponsored Content on Social Media | Thales Bertaglia, Stefan Huber, Catalina Goanta, Gerasimos Spanakis, Adriana Iamnitchi | - | 22 11 0 | 08 Jun 2023
Changing Data Sources in the Age of Machine Learning for Official Statistics | Cedric De Boom, Michael Reusens | - | 22 1 0 | 07 Jun 2023
MEGAnno: Exploratory Labeling for NLP in Computational Notebooks | Dan Zhang, H. Kim, Rafael Li Chen, Eser Kandogan, Estevam R. Hruschka | - | 11 3 0 | 08 Jan 2023
Mix-Pooling Strategy for Attention Mechanism | Shan Zhong, Wushao Wen, Jinghui Qin | - | 33 3 0 | 22 Aug 2022
A Siren Song of Open Source Reproducibility | Edward Raff, Andrew L. Farris | - | 16 9 0 | 09 Apr 2022
Ground-Truth, Whose Truth? -- Examining the Challenges with Annotating Toxic Text Datasets | Kofi Arhin, Ioana Baldini, Dennis L. Wei, Karthikeyan N. Ramamurthy, Moninder Singh | - | 20 19 0 | 07 Dec 2021
A Survey on Bias and Fairness in Machine Learning | Ninareh Mehrabi, Fred Morstatter, N. Saxena, Kristina Lerman, Aram Galstyan | SyDa, FaML | 335 4,223 0 | 23 Aug 2019