Racial Bias in Hate Speech and Abusive Language Detection Datasets

29 May 2019

Papers citing "Racial Bias in Hate Speech and Abusive Language Detection Datasets"

50 / 81 papers shown

Personalisation or Prejudice? Addressing Geographic Bias in Hate Speech Detection using Debias Tuning in Large Language Models Paloma Piot Patricia Martín-Rodilla Javier Parapar 50 0 0 04 May 2025
cantnlp@DravidianLangTech2025: A Bag-of-Sounds Approach to Multimodal Hate Speech Detection Sidney Wong Andrew Li 47 0 0 10 Mar 2025
MASS: Overcoming Language Bias in Image-Text Matching Jiwan Chung Seungwon Lim Sangkyu Lee Youngjae Yu VLM 32 0 0 20 Jan 2025
LLM-Human Pipeline for Cultural Context Grounding of Conversations Rajkumar Pujari Dan Goldwasser 38 1 0 17 Oct 2024
One Language, Many Gaps: Evaluating Dialect Fairness and Robustness of Large Language Models in Reasoning Tasks Fangru Lin Shaoguang Mao Emanuele La Malfa Valentin Hofmann Adrian de Wynter Jing Yao Si-Qing Chen Michael Wooldridge Furu Wei 51 2 0 14 Oct 2024
Analyzing Correlations Between Intrinsic and Extrinsic Bias Metrics of Static Word Embeddings With Their Measuring Biases Aligned Taisei Katô Yusuke Miyao 19 0 0 14 Sep 2024
Towards Generalized Offensive Language Identification A. Dmonte Tejas Arya Tharindu Ranasinghe Marcos Zampieri 52 3 0 26 Jul 2024
Situated Ground Truths: Enhancing Bias-Aware AI by Situating Data Labels with SituAnnotate Delfina Sol Martinez Pandiani Valentina Presutti 24 1 0 10 Jun 2024
From Languages to Geographies: Towards Evaluating Cultural Bias in Hate Speech Datasets Manuel Tonneau Diyi Liu Samuel Fraiberger Ralph Schroeder Scott A. Hale Paul Röttger 37 5 0 27 Apr 2024
Chinese Offensive Language Detection: Current Status and Future Directions Yunze Xiao Houda Bouamor Wajdi Zaghouani 43 2 0 27 Mar 2024
InfFeed: Influence Functions as a Feedback to Improve the Performance of Subjective Tasks Somnath Banerjee Maulindu Sarkar Punyajoy Saha Binny Mathew Animesh Mukherjee TDI 34 0 0 22 Feb 2024
Beyond Detection: Unveiling Fairness Vulnerabilities in Abusive Language Models Yueqing Liang Lu Cheng Ali Payani Kai Shu 28 3 0 15 Nov 2023
Examining Temporal Bias in Abusive Language Detection Mali Jin Yida Mu Diana Maynard Kalina Bontcheva 34 5 0 25 Sep 2023
HateModerate: Testing Hate Speech Detectors against Content Moderation Policies Jiangrui Zheng Xueqing Liu Guanqun Yang Mirazul Haque Xing Qian Ravishka Rathnasuriya Wei Yang G. Budhrani 42 3 0 23 Jul 2023
A Weakly Supervised Classifier and Dataset of White Supremacist Language Michael Miller Yoder Ahmad Diab D. W. Brown Kathleen M. Carley 30 5 0 27 Jun 2023
CFL: Causally Fair Language Models Through Token-level Attribute Controlled Generation Rahul Madhavan Rishabh Garg Kahini Wadhawan S. Mehta 29 5 0 01 Jun 2023
Comparing Biases and the Impact of Multilingual Training across Multiple Languages Sharon Levy Neha Ann John Ling Liu Yogarshi Vyas Jie Ma Yoshinari Fujinuma Miguel Ballesteros Vittorio Castelli Dan Roth 26 25 0 18 May 2023
A statistical approach to detect sensitive features in a group fairness setting G. D. Pelegrina Miguel Couceiro L. Duarte 16 3 0 11 May 2023
Should ChatGPT be Biased? Challenges and Risks of Bias in Large Language Models Emilio Ferrara SILM 36 247 0 07 Apr 2023
Sociocultural knowledge is needed for selection of shots in hate speech detection tasks Antonis Maronikolakis Abdullatif Köksal Hinrich Schütze 43 0 0 04 Apr 2023
Rating Sentiment Analysis Systems for Bias through a Causal Lens Kausik Lakkaraju Biplav Srivastava Marco Valtorta 34 7 0 04 Feb 2023
A Comprehensive Study of Gender Bias in Chemical Named Entity Recognition Models Xingmeng Zhao A. Niazi Anthony Rios 31 2 0 24 Dec 2022
Multi-VALUE: A Framework for Cross-Dialectal English NLP Caleb Ziems William B. Held Jingfeng Yang Jwala Dhamala Rahul Gupta Diyi Yang 46 40 0 15 Dec 2022
Multimodal and Explainable Internet Meme Classification A. Thakur Filip Ilievski Hông-Ân Sandlin Zhivar Sourati Luca Luceri Riccardo Tommasini Alain Mermoud 27 6 0 11 Dec 2022
Human-in-the-Loop Hate Speech Classification in a Multilingual Context Ana Kotarcic Dominik Hangartner Fabrizio Gilardi Selina Kurer K. Donnay 24 2 0 05 Dec 2022
Human-Machine Collaboration Approaches to Build a Dialogue Dataset for Hate Speech Countering Helena Bonaldi Sara Dellantonio Serra Sinem Tekiroğlu Marco Guerini 29 41 0 07 Nov 2022
Why Is It Hate Speech? Masked Rationale Prediction for Explainable Hate Speech Detection Jiyun Kim Byounghan Lee Kyung-ah Sohn 21 13 0 01 Nov 2022
Detecting Unintended Social Bias in Toxic Language Datasets Nihar Ranjan Sahoo Himanshu Gupta P. Bhattacharyya 15 18 0 21 Oct 2022
Re-contextualizing Fairness in NLP: The Case of India Shaily Bhatt Sunipa Dev Partha P. Talukdar Shachi Dave Vinodkumar Prabhakaran 16 54 0 25 Sep 2022
Towards WinoQueer: Developing a Benchmark for Anti-Queer Bias in Large Language Models Virginia K. Felkner Ho-Chun Herbert Chang Eugene Jang Jonathan May OSLM 21 8 0 23 Jun 2022
Multilingual HateCheck: Functional Tests for Multilingual Hate Speech Detection Models Paul Röttger Haitham Seelawi Debora Nozza Zeerak Talat Bertie Vidgen 30 65 0 20 Jun 2022
Detecting Harmful Online Conversational Content towards LGBTQIA+ Individuals Jamell Dacon Harry Shomer Shaylynn Crum-Dacon Jiliang Tang 24 8 0 15 Jun 2022
KOLD: Korean Offensive Language Dataset Young-kuk Jeong Juhyun Oh Jaimeen Ahn Jongwon Lee Jihyung Moon Sungjoon Park Alice H. Oh 57 25 0 23 May 2022
Analyzing Hate Speech Data along Racial, Gender and Intersectional Axes Antonis Maronikolakis Philip Baader Hinrich Schütze 22 9 0 13 May 2022
Hidden behind the obvious: misleading keywords and implicitly abusive language on social media Wenjie Yin A. Zubiaga 31 26 0 03 May 2022
Is Your Toxicity My Toxicity? Exploring the Impact of Rater Identity on Toxicity Annotation Nitesh Goyal Ian D Kivlichan Rachel Rosen Lucy Vasserman 41 90 0 01 May 2022
Using Pre-Trained Language Models for Producing Counter Narratives Against Hate Speech: a Comparative Study Serra Sinem Tekiroğlu Helena Bonaldi Margherita Fanton Marco Guerini 24 43 0 04 Apr 2022
Listening to Affected Communities to Define Extreme Speech: Dataset and Experiments Antonis Maronikolakis Axel Wisiorek Leah Nann Haris Jabbar Sahana Udupa Hinrich Schütze 24 24 0 22 Mar 2022
A New Generation of Perspective API: Efficient Multilingual Character-level Transformers Alyssa Lees Vinh Q. Tran Yi Tay Jeffrey Scott Sorensen Jai Gupta Donald Metzler Lucy Vasserman 25 173 0 22 Feb 2022
Handling Bias in Toxic Speech Detection: A Survey Tanmay Garg Sarah Masud Tharun Suresh Tanmoy Chakraborty 17 91 0 26 Jan 2022
Causal effect of racial bias in data and machine learning algorithms on user persuasiveness & discriminatory decision making: An Empirical Study Kinshuk Sengupta Praveen Ranjan Srivastava 36 6 0 22 Jan 2022
Few-shot Instruction Prompts for Pretrained Language Models to Detect Social Biases Shrimai Prabhumoye Rafal Kocielnik M. Shoeybi Anima Anandkumar Bryan Catanzaro 35 20 0 15 Dec 2021
Unraveling Social Perceptions & Behaviors towards Migrants on Twitter A. Khatua Wolfgang Nejdl 27 11 0 04 Dec 2021
"Stop Asian Hate!" : Refining Detection of Anti-Asian Hate Speech During the COVID-19 Pandemic H. Nghiem Fred Morstatter 25 8 0 04 Dec 2021
CO-STAR: Conceptualisation of Stereotypes for Analysis and Reasoning Teyun Kwon Anandha Gopalan 25 2 0 01 Dec 2021
Annotators with Attitudes: How Annotator Beliefs And Identities Bias Toxic Language Detection Maarten Sap Swabha Swayamdipta Laura Vianna Xuhui Zhou Yejin Choi Noah A. Smith 46 267 0 15 Nov 2021
Cross-lingual Hate Speech Detection using Transformer Models Teodor Tita A. Zubiaga 14 13 0 01 Nov 2021
BBQ: A Hand-Built Bias Benchmark for Question Answering Alicia Parrish Angelica Chen Nikita Nangia Vishakh Padmakumar Jason Phang Jana Thompson Phu Mon Htut Sam Bowman 223 374 0 15 Oct 2021
Mitigating Racial Biases in Toxic Language Detection with an Equity-Based Ensemble Framework Matan Halevy Camille Harris A. Bruckman Diyi Yang A. Howard 42 35 0 27 Sep 2021
Latent Hatred: A Benchmark for Understanding Implicit Hate Speech Mai Elsherief Caleb Ziems D. Muchlinski Vaishnavi Anupindi Jordyn Seybolt M. D. Choudhury Diyi Yang 106 237 0 11 Sep 2021