Designing Toxic Content Classification for a Diversity of Perspectives
arXiv:2106.04511 · 4 June 2021
Deepak Kumar, Patrick Gage Kelley, Sunny Consolvo, Joshua Mason, Elie Bursztein, Zakir Durumeric, Kurt Thomas, Michael C. Bailey
Papers citing "Designing Toxic Content Classification for a Diversity of Perspectives" (16 papers):

1. Annotation alignment: Comparing LLM and human annotations of conversational safety. Rajiv Movva, Pang Wei Koh, Emma Pierson. 10 Jun 2024.
2. GRASP: A Disagreement Analysis Framework to Assess Group Associations in Perspectives. Vinodkumar Prabhakaran, Christopher Homan, Lora Aroyo, Aida Mostafazadeh Davani, Alicia Parrish, Alex S. Taylor, Mark Díaz, Ding Wang, Greg Serapio-García. 09 Nov 2023.
3. Watch Your Language: Investigating Content Moderation with Large Language Models. Deepak Kumar, Y. AbuHashem, Zakir Durumeric. 25 Sep 2023.
4. Sensitivity, Performance, Robustness: Deconstructing the Effect of Sociodemographic Prompting. Tilman Beck, Hendrik Schuff, Anne Lauscher, Iryna Gurevych. 13 Sep 2023.
5. The Ecological Fallacy in Annotation: Modelling Human Label Variation goes beyond Sociodemographics. Matthias Orlikowski, Paul Röttger, Philipp Cimiano. 20 Jun 2023.
6. Intersectionality in Conversational AI Safety: How Bayesian Multilevel Models Help Understand Diverse Perceptions of Safety. Christopher Homan, Greg Serapio-García, Lora Aroyo, Mark Díaz, Alicia Parrish, Vinodkumar Prabhakaran, Alex S. Taylor, Ding Wang. 20 Jun 2023.
7. When the Majority is Wrong: Modeling Annotator Disagreement for Subjective Tasks. Eve Fleisig, Rediet Abebe, Dan Klein. 11 May 2023.
8. How WEIRD is Usable Privacy and Security Research? (Extended Version). A. Hasegawa, Daisuke Inoue, Mitsuaki Akiyama. 08 May 2023.
9. Fairness Evaluation in Text Classification: Machine Learning Practitioner Perspectives of Individual and Group Fairness. Zahra Ashktorab, Benjamin Hoover, Mayank Agarwal, Casey Dugan, Werner Geyer, Han Yang, Mikhail Yurochkin. 01 Mar 2023.
10. Auditing large language models: a three-layered approach. Jakob Mokander, Jonas Schuett, Hannah Rose Kirk, Luciano Floridi. 16 Feb 2023.
11. Personalized Prediction of Offensive News Comments by Considering the Characteristics of Commenters. Teruki Nakahara, Taketoshi Ushiama. 26 Dec 2022.
12. Validating Large Language Models with ReLM. Michael Kuchnik, Virginia Smith, George Amvrosiadis. 21 Nov 2022.
13. Understanding Longitudinal Behaviors of Toxic Accounts on Reddit. Deepak Kumar, Jeffrey T. Hancock, Kurt Thomas, Zakir Durumeric. 06 Sep 2022.
14. How Well Do My Results Generalize Now? The External Validity of Online Privacy and Security Surveys. Jenny Tang, Eleanor Birrell, Ada Lerner. 28 Feb 2022.
15. Automated Identification of Toxic Code Reviews Using ToxiCR. Jaydeb Sarker, Asif Kamal Turzo, Mingyou Dong, Amiangshu Bosu. 26 Feb 2022.
16. Jury Learning: Integrating Dissenting Voices into Machine Learning Models. Mitchell L. Gordon, Michelle S. Lam, J. Park, Kayur Patel, Jeffrey T. Hancock, Tatsunori Hashimoto, Michael S. Bernstein. 07 Feb 2022.