ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

Aggregation Artifacts in Subjective Tasks Collapse Large Language Models' Posteriors

17 October 2024
Georgios Chochlakis, Alexandros Potamianos, Kristina Lerman, Shrikanth Narayanan

Papers citing "Aggregation Artifacts in Subjective Tasks Collapse Large Language Models' Posteriors"

33 papers shown
Large Language Models Do Multi-Label Classification Differently
Marcus Ma, Georgios Chochlakis, Niyantha Maruthu Pandiyan, Jesse Thomason, Shrikanth Narayanan · 23 May 2025
Humans Hallucinate Too: Language Models Identify and Correct Subjective Annotation Errors With Label-in-a-Haystack Prompts
Georgios Chochlakis, Peter Wu, Arjun Bedi, Marcus Ma, Kristina Lerman, Shrikanth Narayanan · 22 May 2025
Larger Language Models Don't Care How You Think: Why Chain-of-Thought Prompting Fails in Subjective Tasks
Georgios Chochlakis, Niyantha Maruthu Pandiyan, Kristina Lerman, Shrikanth Narayanan · 10 Sep 2024
GPT-4 Emulates Average-Human Emotional Cognition from a Third-Person Perspective
Ala Nekouvaght Tak, Jonathan Gratch · 11 Aug 2024
How Does Quantization Affect Multilingual LLMs?
Kelly Marchisio, Saurabh Dash, Hongyu Chen, Dennis Aumiller, Ahmet Üstün, Sara Hooker, Sebastian Ruder · 03 Jul 2024
The Strong Pull of Prior Knowledge in Large Language Models and Its Impact on Emotion Recognition
Georgios Chochlakis, Alexandros Potamianos, Kristina Lerman, Shrikanth Narayanan · 25 Mar 2024
Capturing Perspectives of Crowdsourced Annotators in Subjective Learning Tasks
Negar Mokhberian, Myrl G. Marmarelis, F. R. Hopp, Valerio Basile, Fred Morstatter, Kristina Lerman · 16 Nov 2023
GistScore: Learning Better Representations for In-Context Example Selection with Gist Bottlenecks
Shivanshu Gupta, Clemens Rosenbaum, Ethan R. Elenberg · 16 Nov 2023
Modeling subjectivity (by Mimicking Annotator Annotation) in toxic comment identification across diverse communities
Senjuti Dutta, Sid Mittal, Sherol Chen, Deepak Ramachandran, Ravi Rajakumar, Ian D Kivlichan, Sunny Mak, Alena Butryna, Praveen Paritosh · 01 Nov 2023
Measuring Faithfulness in Chain-of-Thought Reasoning
Tamera Lanham, Anna Chen, Ansh Radhakrishnan, Benoit Steiner, Carson E. Denison, ..., Zac Hatfield-Dodds, Jared Kaplan, J. Brauner, Sam Bowman, Ethan Perez · 17 Jul 2023
Language Models Don't Always Say What They Think: Unfaithful Explanations in Chain-of-Thought Prompting
Miles Turpin, Julian Michael, Ethan Perez, Sam Bowman · 07 May 2023
Larger language models do in-context learning differently
Jerry W. Wei, Jason W. Wei, Yi Tay, Dustin Tran, Albert Webson, ..., Xinyun Chen, Hanxiao Liu, Da Huang, Denny Zhou, Tengyu Ma · 07 Mar 2023
The political ideology of conversational AI: Converging evidence on ChatGPT's pro-environmental, left-libertarian orientation
Jochen Hartmann, Jasper Schwenzow, Maximilian Witte · 05 Jan 2023
Leveraging Label Correlations in a Multi-label Setting: A Case Study in Emotion
Georgios Chochlakis, Gireesh Mahajan, Sabyasachee Baruah, Keith Burghardt, Kristina Lerman, Shrikanth Narayanan · 28 Oct 2022
Noise Audits Improve Moral Foundation Classification
Negar Mokhberian, F. R. Hopp, Bahareh Harandizadeh, Fred Morstatter, Kristina Lerman · 13 Oct 2022
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, ..., Zhuoye Zhao, Zijian Wang, Zijie J. Wang, Zirui Wang, Ziyi Wu · 09 Jun 2022
Data Distributional Properties Drive Emergent In-Context Learning in Transformers
Stephanie C. Y. Chan, Adam Santoro, Andrew Kyle Lampinen, Jane X. Wang, Aaditya K. Singh, Pierre Harvey Richemond, J. Mcclelland, Felix Hill · 22 Apr 2022
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Xuezhi Wang, Jason W. Wei, Dale Schuurmans, Quoc Le, Ed H. Chi, Sharan Narang, Aakanksha Chowdhery, Denny Zhou · 21 Mar 2022
Training language models to follow instructions with human feedback
Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe · 04 Mar 2022
Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?
Sewon Min, Xinxi Lyu, Ari Holtzman, Mikel Artetxe, M. Lewis, Hannaneh Hajishirzi, Luke Zettlemoyer · 25 Feb 2022
Learning To Retrieve Prompts for In-Context Learning
Ohad Rubin, Jonathan Herzig, Jonathan Berant · 16 Dec 2021
Annotators with Attitudes: How Annotator Beliefs And Identities Bias Toxic Language Detection
Maarten Sap, Swabha Swayamdipta, Laura Vianna, Xuhui Zhou, Yejin Choi, Noah A. Smith · 15 Nov 2021
An Explanation of In-context Learning as Implicit Bayesian Inference
Sang Michael Xie, Aditi Raghunathan, Percy Liang, Tengyu Ma · 03 Nov 2021
Dealing with Disagreements: Looking Beyond the Majority Vote in Subjective Annotations
Aida Mostafazadeh Davani, Mark Díaz, Vinodkumar Prabhakaran · 12 Oct 2021
On Releasing Annotator-Level Labels and Information in Datasets
Vinodkumar Prabhakaran, Aida Mostafazadeh Davani, Mark Díaz · 12 Oct 2021
Survey Equivalence: A Procedure for Measuring Classifier Accuracy Against Human Labels
Paul Resnick, Yuqing Kong, Grant Schoenebeck, Tim Weninger · 02 Jun 2021
SpanEmo: Casting Multi-label Emotion Classification as Span-prediction
Hassan Alhuzali, Sophia Ananiadou · 25 Jan 2021
Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics
Swabha Swayamdipta, Roy Schwartz, Nicholas Lourie, Yizhong Wang, Hannaneh Hajishirzi, Noah A. Smith, Yejin Choi · 22 Sep 2020
Language Models are Few-Shot Learners
Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, ..., Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei · 28 May 2020
GoEmotions: A Dataset of Fine-Grained Emotions
Dorottya Demszky, Dana Movshovitz-Attias, Jeongwoo Ko, Alan S. Cowen, Gaurav Nemade, Sujith Ravi · 01 May 2020
Lipstick on a Pig: Debiasing Methods Cover up Systematic Gender Biases in Word Embeddings But do not Remove Them
Hila Gonen, Yoav Goldberg · 09 Mar 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova · 11 Oct 2018
Semantics derived automatically from language corpora contain human-like biases
Aylin Caliskan, J. Bryson, Arvind Narayanan · 25 Aug 2016