ResearchTrend.AI
Don't Blame the Annotator: Bias Already Starts in the Annotation Instructions

1 May 2022
Mihir Parmar
Swaroop Mishra
Mor Geva
Chitta Baral

Papers citing "Don't Blame the Annotator: Bias Already Starts in the Annotation Instructions"

44 / 44 papers shown
Ethics and Persuasion in Reinforcement Learning from Human Feedback: A Procedural Rhetorical Approach
Shannon Lodoen
Alexi Orchard
13
0
0
14 May 2025
Investigating the Capabilities and Limitations of Machine Learning for Identifying Bias in English Language Data with Information and Heritage Professionals
Lucy Havens
Benjamin Bach
Melissa Mhairi Terras
Beatrice Alex
49
0
0
01 Apr 2025
AI Alignment at Your Discretion
Maarten Buyl
Hadi Khalaf
C. M. Verdun
Lucas Monteiro Paes
Caio Vieira Machado
Flavio du Pin Calmon
45
0
0
10 Feb 2025
A Comprehensive Evaluation of Cognitive Biases in LLMs
Simon Malberg
Roman Poletukhin
Carolin M. Schuster
Georg Groh
ELM
40
5
0
20 Oct 2024
Human and LLM Biases in Hate Speech Annotations: A Socio-Demographic Analysis of Annotators and Targets
Tommaso Giorgi
Lorenzo Cima
T. Fagni
M. Avvenuti
S. Cresci
42
9
0
10 Oct 2024
Blind Spots and Biases: Exploring the Role of Annotator Cognitive Biases in NLP
Sanjana Gautam
Mukund Srinath
40
6
0
29 Apr 2024
Context Does Matter: Implications for Crowdsourced Evaluation Labels in Task-Oriented Dialogue Systems
Clemencia Siro
Mohammad Aliannejadi
Maarten de Rijke
43
3
0
15 Apr 2024
RLHF Deciphered: A Critical Analysis of Reinforcement Learning from Human Feedback for LLMs
Shreyas Chaudhari
Pranjal Aggarwal
Vishvak Murahari
Tanmay Rajpurohit
Ashwin Kalyan
Karthik Narasimhan
Ameet Deshpande
Bruno Castro da Silva
29
34
0
12 Apr 2024
Position: Insights from Survey Methodology can Improve Training Data
Stephanie Eckman
Barbara Plank
Frauke Kreuter
SyDa
41
3
0
02 Mar 2024
TRUCE: Private Benchmarking to Prevent Contamination and Improve Comparative Evaluation of LLMs
Tanmay Rajore
Nishanth Chandran
Sunayana Sitaram
Divya Gupta
Rahul Sharma
Kashish Mittal
Manohar Swaminathan
47
14
0
01 Mar 2024
Evaluating Webcam-based Gaze Data as an Alternative for Human Rationale Annotations
Stephanie Brandl
Oliver Eberle
Tiago F. R. Ribeiro
Anders Søgaard
Nora Hollenstein
40
1
0
29 Feb 2024
Artifacts or Abduction: How Do LLMs Answer Multiple-Choice Questions Without the Question?
Nishant Balepur
Abhilasha Ravichander
Rachel Rudinger
ELM
40
19
0
19 Feb 2024
"Define Your Terms" : Enhancing Efficient Offensive Speech Classification with Definition
H. Nghiem
Umang Gupta
Fred Morstatter
39
4
0
05 Feb 2024
The Iron(ic) Melting Pot: Reviewing Human Evaluation in Humour, Irony and Sarcasm Generation
Tyler Loakman
Aaron Maladry
Chenghua Lin
18
7
0
09 Nov 2023
TarGEN: Targeted Data Generation with Large Language Models
Himanshu Gupta
Kevin Scaria
Ujjwala Anantheswaran
Shreyas Verma
Mihir Parmar
Saurabh Arjun Sawant
Chitta Baral
Swaroop Mishra
SyDa
38
8
0
27 Oct 2023
Unveiling the Multi-Annotation Process: Examining the Influence of Annotation Quantity and Instance Difficulty on Model Performance
Pritam Kadasi
Mayank Singh
29
3
0
23 Oct 2023
An Empirical Study of Translation Hypothesis Ensembling with Large Language Models
António Farinhas
José G. C. de Souza
André F. T. Martins
31
8
0
17 Oct 2023
Take a Step Back: Evoking Reasoning via Abstraction in Large Language Models
Huaixiu Steven Zheng
Swaroop Mishra
Xinyun Chen
Heng-Tze Cheng
Ed H. Chi
Quoc V. Le
Denny Zhou
RALM
LRM
25
109
0
09 Oct 2023
WikiIns: A High-Quality Dataset for Controlled Text Editing by Natural Language Instruction
Xiang Chen
Zheng Li
Xiaojun Wan
21
0
0
08 Oct 2023
On the Challenges of Building Datasets for Hate Speech Detection
Vitthal Bhandari
20
1
0
06 Sep 2023
Towards Addressing the Misalignment of Object Proposal Evaluation for Vision-Language Tasks via Semantic Grounding
Joshua Forster Feinglass
Yezhou Yang
24
2
0
01 Sep 2023
Never-ending Learning of User Interfaces
Jason Wu
Rebecca Krosnick
E. Schoop
Amanda Swearngin
Jeffrey P. Bigham
Jeffrey Nichols
VLM
HAI
19
15
0
17 Aug 2023
Uncertainty in Natural Language Generation: From Theory to Applications
Joris Baan
Nico Daheim
Evgenia Ilia
Dennis Ulmer
Haau-Sing Li
Raquel Fernández
Barbara Plank
Rico Sennrich
Chrysoula Zerva
Wilker Aziz
UQLM
34
40
0
28 Jul 2023
Analyzing Dataset Annotation Quality Management in the Wild
Jan-Christoph Klie
Richard Eckart de Castilho
Iryna Gurevych
21
17
0
16 Jul 2023
Enough With "Human-AI Collaboration"
Advait Sarkar
29
28
0
02 Jun 2023
On Degrees of Freedom in Defining and Testing Natural Language Understanding
Saku Sugawara
S. Tsugita
ELM
34
1
0
24 May 2023
It Takes Two to Tango: Navigating Conceptualizations of NLP Tasks and Measurements of Performance
Arjun Subramonian
Xingdi Yuan
Hal Daumé
Su Lin Blodgett
47
17
0
15 May 2023
What's the Meaning of Superhuman Performance in Today's NLU?
Simone Tedeschi
Johan Bos
T. Declerck
Jan Hajic
Daniel Hershcovich
...
Simon Krek
Steven Schockaert
Rico Sennrich
Ekaterina Shutova
Roberto Navigli
ELM
LM&MA
VLM
ReLM
LRM
36
26
0
15 May 2023
Bridging the Gap: A Survey on Integrating (Human) Feedback for Natural Language Generation
Patrick Fernandes
Aman Madaan
Emmy Liu
António Farinhas
Pedro Henrique Martins
...
José G. C. de Souza
Shuyan Zhou
Tongshuang Wu
Graham Neubig
André F. T. Martins
ALM
117
56
0
01 May 2023
LINGO : Visually Debiasing Natural Language Instructions to Support Task Diversity
Anjana Arunkumar
Shubham Sharma
Rakhi Agrawal
Sriramakrishnan Chandrasekaran
Chris Bryan
34
0
0
12 Apr 2023
Large Language Model Instruction Following: A Survey of Progresses and Challenges
Renze Lou
Kai Zhang
Wenpeng Yin
ALM
LRM
32
20
0
18 Mar 2023
Fairness in Language Models Beyond English: Gaps and Challenges
Krithika Ramesh
Sunayana Sitaram
Monojit Choudhury
32
23
0
24 Feb 2023
Real-Time Visual Feedback to Guide Benchmark Creation: A Human-and-Metric-in-the-Loop Workflow
Anjana Arunkumar
Swaroop Mishra
Bhavdeep Singh Sachdeva
Chitta Baral
Chris Bryan
30
0
0
09 Feb 2023
Investigating Labeler Bias in Face Annotation for Machine Learning
Luke Haliburton
Sinksar Ghebremedhin
Robin Welsch
Albrecht Schmidt
Sven Mayer
32
4
0
24 Jan 2023
Evaluating Human-Language Model Interaction
Mina Lee
Megha Srivastava
Amelia Hardy
John Thickstun
Esin Durmus
...
Hancheng Cao
Tony Lee
Rishi Bommasani
Michael S. Bernstein
Percy Liang
LM&MA
ALM
58
99
0
19 Dec 2022
Leveraging Data Recasting to Enhance Tabular Reasoning
Aashna Jena
Vivek Gupta
Manish Shrivastava
Julian Martin Eisenschlos
LMTD
27
6
0
23 Nov 2022
A Survey of Parameters Associated with the Quality of Benchmarks in NLP
Swaroop Mishra
Anjana Arunkumar
Chris Bryan
Chitta Baral
37
1
0
14 Oct 2022
BioTABQA: Instruction Learning for Biomedical Table Question Answering
Man Luo
S. Saxena
Swaroop Mishra
Mihir Parmar
Chitta Baral
LMTD
157
15
0
06 Jul 2022
Experimental Standards for Deep Learning in Natural Language Processing Research
Dennis Ulmer
Elisa Bassignana
Max Müller-Eberstein
Daniel Varab
Mike Zhang
Rob van der Goot
Christian Hardmeier
Barbara Plank
19
10
0
13 Apr 2022
Less is More: Summary of Long Instructions is Better for Program Synthesis
Kirby Kuznia
Swaroop Mishra
Mihir Parmar
Chitta Baral
AIMat
28
22
0
16 Mar 2022
PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts
Stephen H. Bach
Victor Sanh
Zheng-Xin Yong
Albert Webson
Colin Raffel
...
Khalid Almubarak
Xiangru Tang
Dragomir R. Radev
Mike Tian-Jian Jiang
Alexander M. Rush
VLM
225
339
0
02 Feb 2022
Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies
Mor Geva
Daniel Khashabi
Elad Segal
Tushar Khot
Dan Roth
Jonathan Berant
RALM
250
677
0
06 Jan 2021
Are We Modeling the Task or the Annotator? An Investigation of Annotator Bias in Natural Language Understanding Datasets
Mor Geva
Yoav Goldberg
Jonathan Berant
242
320
0
21 Aug 2019
Hypothesis Only Baselines in Natural Language Inference
Adam Poliak
Jason Naradowsky
Aparajita Haldar
Rachel Rudinger
Benjamin Van Durme
190
576
0
02 May 2018