Don't Blame the Annotator: Bias Already Starts in the Annotation Instructions

1 May 2022

Papers citing "Don't Blame the Annotator: Bias Already Starts in the Annotation Instructions"

Ethics and Persuasion in Reinforcement Learning from Human Feedback: A Procedural Rhetorical Approach Shannon Lodoen Alexi Orchard 13 0 0 14 May 2025
Investigating the Capabilities and Limitations of Machine Learning for Identifying Bias in English Language Data with Information and Heritage Professionals Lucy Havens Benjamin Bach Melissa Mhairi Terras Beatrice Alex 49 0 0 01 Apr 2025
AI Alignment at Your Discretion Maarten Buyl Hadi Khalaf C. M. Verdun Lucas Monteiro Paes Caio Vieira Machado Flavio du Pin Calmon 45 0 0 10 Feb 2025
A Comprehensive Evaluation of Cognitive Biases in LLMs Simon Malberg Roman Poletukhin Carolin M. Schuster Georg Groh ELM 40 5 0 20 Oct 2024
Human and LLM Biases in Hate Speech Annotations: A Socio-Demographic Analysis of Annotators and Targets Tommaso Giorgi Lorenzo Cima T. Fagni M. Avvenuti S. Cresci 42 9 0 10 Oct 2024
Blind Spots and Biases: Exploring the Role of Annotator Cognitive Biases in NLP Sanjana Gautam Mukund Srinath 40 6 0 29 Apr 2024
Context Does Matter: Implications for Crowdsourced Evaluation Labels in Task-Oriented Dialogue Systems Clemencia Siro Mohammad Aliannejadi Maarten de Rijke 43 3 0 15 Apr 2024
RLHF Deciphered: A Critical Analysis of Reinforcement Learning from Human Feedback for LLMs Shreyas Chaudhari Pranjal Aggarwal Vishvak Murahari Tanmay Rajpurohit Ashwin Kalyan Karthik Narasimhan Ameet Deshpande Bruno Castro da Silva 29 34 0 12 Apr 2024
Position: Insights from Survey Methodology can Improve Training Data Stephanie Eckman Barbara Plank Frauke Kreuter SyDa 41 3 0 02 Mar 2024
TRUCE: Private Benchmarking to Prevent Contamination and Improve Comparative Evaluation of LLMs Tanmay Rajore Nishanth Chandran Sunayana Sitaram Divya Gupta Rahul Sharma Kashish Mittal Manohar Swaminathan 47 14 0 01 Mar 2024
Evaluating Webcam-based Gaze Data as an Alternative for Human Rationale Annotations Stephanie Brandl Oliver Eberle Tiago F. R. Ribeiro Anders Søgaard Nora Hollenstein 40 1 0 29 Feb 2024
Artifacts or Abduction: How Do LLMs Answer Multiple-Choice Questions Without the Question? Nishant Balepur Abhilasha Ravichander Rachel Rudinger ELM 40 19 0 19 Feb 2024
"Define Your Terms" : Enhancing Efficient Offensive Speech Classification with Definition H. Nghiem Umang Gupta Fred Morstatter 39 4 0 05 Feb 2024
The Iron(ic) Melting Pot: Reviewing Human Evaluation in Humour, Irony and Sarcasm Generation Tyler Loakman Aaron Maladry Chenghua Lin 18 7 0 09 Nov 2023
TarGEN: Targeted Data Generation with Large Language Models Himanshu Gupta Kevin Scaria Ujjwala Anantheswaran Shreyas Verma Mihir Parmar Saurabh Arjun Sawant Chitta Baral Swaroop Mishra SyDa 38 8 0 27 Oct 2023
Unveiling the Multi-Annotation Process: Examining the Influence of Annotation Quantity and Instance Difficulty on Model Performance Pritam Kadasi Mayank Singh 29 3 0 23 Oct 2023
An Empirical Study of Translation Hypothesis Ensembling with Large Language Models António Farinhas José G. C. de Souza André F. T. Martins 31 8 0 17 Oct 2023
Take a Step Back: Evoking Reasoning via Abstraction in Large Language Models Huaixiu Steven Zheng Swaroop Mishra Xinyun Chen Heng-Tze Cheng Ed H. Chi Quoc V. Le Denny Zhou RALM LRM 25 109 0 09 Oct 2023
WikiIns: A High-Quality Dataset for Controlled Text Editing by Natural Language Instruction Xiang Chen Zheng Li Xiaojun Wan 21 0 0 08 Oct 2023
On the Challenges of Building Datasets for Hate Speech Detection Vitthal Bhandari 20 1 0 06 Sep 2023
Towards Addressing the Misalignment of Object Proposal Evaluation for Vision-Language Tasks via Semantic Grounding Joshua Forster Feinglass Yezhou Yang 24 2 0 01 Sep 2023
Never-ending Learning of User Interfaces Jason Wu Rebecca Krosnick E. Schoop Amanda Swearngin Jeffrey P. Bigham Jeffrey Nichols VLM HAI 19 15 0 17 Aug 2023
Uncertainty in Natural Language Generation: From Theory to Applications Joris Baan Nico Daheim Evgenia Ilia Dennis Ulmer Haau-Sing Li Raquel Fernández Barbara Plank Rico Sennrich Chrysoula Zerva Wilker Aziz UQLM 34 40 0 28 Jul 2023
Analyzing Dataset Annotation Quality Management in the Wild Jan-Christoph Klie Richard Eckart de Castilho Iryna Gurevych 21 17 0 16 Jul 2023
Enough With "Human-AI Collaboration" Advait Sarkar 29 28 0 02 Jun 2023
On Degrees of Freedom in Defining and Testing Natural Language Understanding Saku Sugawara S. Tsugita ELM 34 1 0 24 May 2023
It Takes Two to Tango: Navigating Conceptualizations of NLP Tasks and Measurements of Performance Arjun Subramonian Xingdi Yuan Hal Daumé Su Lin Blodgett 47 17 0 15 May 2023
What's the Meaning of Superhuman Performance in Today's NLU? Simone Tedeschi Johan Bos T. Declerck Jan Hajic Daniel Hershcovich ... Simon Krek Steven Schockaert Rico Sennrich Ekaterina Shutova Roberto Navigli ELM LM&MA VLM ReLM LRM 36 26 0 15 May 2023
Bridging the Gap: A Survey on Integrating (Human) Feedback for Natural Language Generation Patrick Fernandes Aman Madaan Emmy Liu António Farinhas Pedro Henrique Martins ... José G. C. de Souza Shuyan Zhou Tongshuang Wu Graham Neubig André F. T. Martins ALM 117 56 0 01 May 2023
LINGO : Visually Debiasing Natural Language Instructions to Support Task Diversity Anjana Arunkumar Shubham Sharma Rakhi Agrawal Sriramakrishnan Chandrasekaran Chris Bryan 34 0 0 12 Apr 2023
Large Language Model Instruction Following: A Survey of Progresses and Challenges Renze Lou Kai Zhang Wenpeng Yin ALM LRM 32 20 0 18 Mar 2023
Fairness in Language Models Beyond English: Gaps and Challenges Krithika Ramesh Sunayana Sitaram Monojit Choudhury 32 23 0 24 Feb 2023
Real-Time Visual Feedback to Guide Benchmark Creation: A Human-and-Metric-in-the-Loop Workflow Anjana Arunkumar Swaroop Mishra Bhavdeep Singh Sachdeva Chitta Baral Chris Bryan 30 0 0 09 Feb 2023
Investigating Labeler Bias in Face Annotation for Machine Learning Luke Haliburton Sinksar Ghebremedhin Robin Welsch Albrecht Schmidt Sven Mayer 32 4 0 24 Jan 2023
Evaluating Human-Language Model Interaction Mina Lee Megha Srivastava Amelia Hardy John Thickstun Esin Durmus ... Hancheng Cao Tony Lee Rishi Bommasani Michael S. Bernstein Percy Liang LM&MA ALM 58 99 0 19 Dec 2022
Leveraging Data Recasting to Enhance Tabular Reasoning Aashna Jena Vivek Gupta Manish Shrivastava Julian Martin Eisenschlos LMTD 27 6 0 23 Nov 2022
A Survey of Parameters Associated with the Quality of Benchmarks in NLP Swaroop Mishra Anjana Arunkumar Chris Bryan Chitta Baral 37 1 0 14 Oct 2022
BioTABQA: Instruction Learning for Biomedical Table Question Answering Man Luo S. Saxena Swaroop Mishra Mihir Parmar Chitta Baral LMTD 157 15 0 06 Jul 2022
Experimental Standards for Deep Learning in Natural Language Processing Research Dennis Ulmer Elisa Bassignana Max Müller-Eberstein Daniel Varab Mike Zhang Rob van der Goot Christian Hardmeier Barbara Plank 19 10 0 13 Apr 2022
Less is More: Summary of Long Instructions is Better for Program Synthesis Kirby Kuznia Swaroop Mishra Mihir Parmar Chitta Baral AIMat 28 22 0 16 Mar 2022
PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts Stephen H. Bach Victor Sanh Zheng-Xin Yong Albert Webson Colin Raffel ... Khalid Almubarak Xiangru Tang Dragomir R. Radev Mike Tian-Jian Jiang Alexander M. Rush VLM 225 339 0 02 Feb 2022
Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies Mor Geva Daniel Khashabi Elad Segal Tushar Khot Dan Roth Jonathan Berant RALM 250 677 0 06 Jan 2021
Are We Modeling the Task or the Annotator? An Investigation of Annotator Bias in Natural Language Understanding Datasets Mor Geva Yoav Goldberg Jonathan Berant 242 320 0 21 Aug 2019
Hypothesis Only Baselines in Natural Language Inference Adam Poliak Jason Naradowsky Aparajita Haldar Rachel Rudinger Benjamin Van Durme 190 576 0 02 May 2018