Correcting Annotator Bias in Training Data: Population-Aligned Instance Replication (PAIR)

12 January 2025

Stephanie Eckman

Papers citing "Correcting Annotator Bias in Training Data: Population-Aligned Instance Replication (PAIR)"

Title | Authors | Tags | Counts | Date
Human and LLM Biases in Hate Speech Annotations: A Socio-Demographic Analysis of Annotators and Targets | Tommaso Giorgi, Lorenzo Cima, T. Fagni, Marco Avvenuti, S. Cresci | | 160 11 0 | 10 Oct 2024
The Perspectivist Paradigm Shift: Assumptions and Challenges of Capturing Human Labels | Eve Fleisig, Su Lin Blodgett, Dan Klein, Zeerak Talat | | 58 15 0 | 09 May 2024
How to be fair? A study of label and selection bias | Marco Favier, T. Calders, Sam Pinxteren, Jonathan Meyer | | 91 9 0 | 21 Mar 2024
Position: Insights from Survey Methodology can Improve Training Data | Stephanie Eckman, Barbara Plank, Frauke Kreuter | SyDa | 65 5 0 | 02 Mar 2024
Discipline and Label: A WEIRD Genealogy and Social Theory of Data Annotation | Andrew Smart, Ding Wang, Ellis Monk, Mark Díaz, Atoosa Kasirzadeh, Erin van Liemt, Sonja Schmer-Galunder | | 83 8 0 | 09 Feb 2024
Annotation Sensitivity: Training Data Collection Methods Affect Model Performance | Christoph Kern, Stephanie Eckman, Jacob Beck, Rob Chew, Bolei Ma, Frauke Kreuter | | 61 10 0 | 23 Nov 2023
When the Majority is Wrong: Modeling Annotator Disagreement for Subjective Tasks | Eve Fleisig, Rediet Abebe, Dan Klein | | 77 49 0 | 11 May 2023
SemEval-2023 Task 11: Learning With Disagreements (LeWiDi) | Elisa Leonardelli, Alexandra Uma, Gavin Abercrombie, Dina Almanea, Valerio Basile, Tommaso Fornaciari, Barbara Plank, Verena Rieser, Massimo Poesio | | 71 57 0 | 28 Apr 2023
Training language models to follow instructions with human feedback | Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe | OSLM ALM | 891 13,228 0 | 04 Mar 2022
Annotators with Attitudes: How Annotator Beliefs And Identities Bias Toxic Language Detection | Maarten Sap, Swabha Swayamdipta, Laura Vianna, Xuhui Zhou, Yejin Choi, Noah A. Smith | | 89 283 0 | 15 Nov 2021
On Releasing Annotator-Level Labels and Information in Datasets | Vinodkumar Prabhakaran, Aida Mostafazadeh Davani, Mark Díaz | | 93 150 0 | 12 Oct 2021
Representation Matters: Assessing the Importance of Subgroup Allocations in Training Data | Esther Rolf, Theodora Worledge, Benjamin Recht, Michael I. Jordan | | 62 33 0 | 05 Mar 2021
PyTorch: An Imperative Style, High-Performance Deep Learning Library | Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, ..., Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, Soumith Chintala | ODL | 568 42,677 0 | 03 Dec 2019
Aleatoric and Epistemic Uncertainty in Machine Learning: An Introduction to Concepts and Methods | Eyke Hüllermeier, Willem Waegeman | PER UD | 258 1,432 0 | 21 Oct 2019
A Survey on Bias and Fairness in Machine Learning | Ninareh Mehrabi, Fred Morstatter, N. Saxena, Kristina Lerman, Aram Galstyan | SyDa FaML | 578 4,391 0 | 23 Aug 2019
RoBERTa: A Robustly Optimized BERT Pretraining Approach | Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, M. Lewis, Luke Zettlemoyer, Veselin Stoyanov | AIMat | 700 24,572 0 | 26 Jul 2019
On Calibration of Modern Neural Networks | Chuan Guo, Geoff Pleiss, Yu Sun, Kilian Q. Weinberger | UQCV | 299 5,877 0 | 14 Jun 2017
Automated Hate Speech Detection and the Problem of Offensive Language | Thomas Davidson, Dana Warmsley, M. Macy, Ingmar Weber | | 79 2,703 0 | 11 Mar 2017