arXiv:2505.17222
Humans Hallucinate Too: Language Models Identify and Correct Subjective Annotation Errors With Label-in-a-Haystack Prompts
22 May 2025
Georgios Chochlakis, Peter Wu, Arjun Bedi, Marcus Ma, Kristina Lerman, Shrikanth Narayanan

Papers citing "Humans Hallucinate Too: Language Models Identify and Correct Subjective Annotation Errors With Label-in-a-Haystack Prompts" (24 papers)

Aggregation Artifacts in Subjective Tasks Collapse Large Language Models' Posteriors. Georgios Chochlakis, Alexandros Potamianos, Kristina Lerman, Shrikanth Narayanan. 17 Oct 2024.
How Does Quantization Affect Multilingual LLMs? Kelly Marchisio, Saurabh Dash, Hongyu Chen, Dennis Aumiller, Ahmet Üstün, Sara Hooker, Sebastian Ruder. 03 Jul 2024.
Harmful Speech Detection by Language Models Exhibits Gender-Queer Dialect Bias. Rebecca Dorn, Lee Kezar, Fred Morstatter, Kristina Lerman. 23 May 2024.
The Strong Pull of Prior Knowledge in Large Language Models and Its Impact on Emotion Recognition. Georgios Chochlakis, Alexandros Potamianos, Kristina Lerman, Shrikanth Narayanan. 25 Mar 2024.
Capturing Perspectives of Crowdsourced Annotators in Subjective Learning Tasks. Negar Mokhberian, Myrl G. Marmarelis, F. R. Hopp, Valerio Basile, Fred Morstatter, Kristina Lerman. 16 Nov 2023.
Modeling subjectivity (by Mimicking Annotator Annotation) in toxic comment identification across diverse communities. Senjuti Dutta, Sid Mittal, Sherol Chen, Deepak Ramachandran, Ravi Rajakumar, Ian D. Kivlichan, Sunny Mak, Alena Butryna, Praveen Paritosh. 01 Nov 2023.
Foundation Model Assisted Automatic Speech Emotion Recognition: Transcribing, Annotating, and Augmenting. Tiantian Feng, Shrikanth Narayanan. 15 Sep 2023.
Trusting Your Evidence: Hallucinate Less with Context-aware Decoding. Weijia Shi, Xiaochuang Han, Mike Lewis, Yulia Tsvetkov, Luke Zettlemoyer, Scott Yih. 24 May 2023.
Language Models Don't Always Say What They Think: Unfaithful Explanations in Chain-of-Thought Prompting. Miles Turpin, Julian Michael, Ethan Perez, Sam Bowman. 07 May 2023.
We're Afraid Language Models Aren't Modeling Ambiguity. Alisa Liu, Zhaofeng Wu, Julian Michael, Alane Suhr, Peter West, Alexander Koller, Swabha Swayamdipta, Noah A. Smith, Yejin Choi. 27 Apr 2023.
The political ideology of conversational AI: Converging evidence on ChatGPT's pro-environmental, left-libertarian orientation. Jochen Hartmann, Jasper Schwenzow, Maximilian Witte. 05 Jan 2023.
Leveraging Label Correlations in a Multi-label Setting: A Case Study in Emotion. Georgios Chochlakis, Gireesh Mahajan, Sabyasachee Baruah, Keith Burghardt, Kristina Lerman, Shrikanth Narayanan. 28 Oct 2022.
Noise Audits Improve Moral Foundation Classification. Negar Mokhberian, F. R. Hopp, Bahareh Harandizadeh, Fred Morstatter, Kristina Lerman. 13 Oct 2022.
Training language models to follow instructions with human feedback. Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe. 04 Mar 2022.
Rethinking the Role of Demonstrations: What Makes In-Context Learning Work? Sewon Min, Xinxi Lyu, Ari Holtzman, Mikel Artetxe, Mike Lewis, Hannaneh Hajishirzi, Luke Zettlemoyer. 25 Feb 2022.
Annotators with Attitudes: How Annotator Beliefs and Identities Bias Toxic Language Detection. Maarten Sap, Swabha Swayamdipta, Laura Vianna, Xuhui Zhou, Yejin Choi, Noah A. Smith. 15 Nov 2021.
Dealing with Disagreements: Looking Beyond the Majority Vote in Subjective Annotations. Aida Mostafazadeh Davani, Mark Díaz, Vinodkumar Prabhakaran. 12 Oct 2021.
Does Knowledge Distillation Really Work? Samuel Stanton, Pavel Izmailov, Polina Kirichenko, Alexander A. Alemi, Andrew Gordon Wilson. 10 Jun 2021.
Survey Equivalence: A Procedure for Measuring Classifier Accuracy Against Human Labels. Paul Resnick, Yuqing Kong, Grant Schoenebeck, Tim Weninger. 02 Jun 2021.
Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics. Swabha Swayamdipta, Roy Schwartz, Nicholas Lourie, Yizhong Wang, Hannaneh Hajishirzi, Noah A. Smith, Yejin Choi. 22 Sep 2020.
GoEmotions: A Dataset of Fine-Grained Emotions. Dorottya Demszky, Dana Movshovitz-Attias, Jeongwoo Ko, Alan S. Cowen, Gaurav Nemade, Sujith Ravi. 01 May 2020.
Towards Faithfully Interpretable NLP Systems: How should we define and evaluate faithfulness? Alon Jacovi, Yoav Goldberg. 07 Apr 2020.
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova. 11 Oct 2018.
Attention Is All You Need. Ashish Vaswani, Noam M. Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan Gomez, Lukasz Kaiser, Illia Polosukhin. 12 Jun 2017.