arXiv:1905.12516
Racial Bias in Hate Speech and Abusive Language Detection Datasets
29 May 2019
Thomas Davidson
Debasmita Bhattacharya
Ingmar Weber
Papers citing
"Racial Bias in Hate Speech and Abusive Language Detection Datasets"
50 / 81 papers shown
Title
Personalisation or Prejudice? Addressing Geographic Bias in Hate Speech Detection using Debias Tuning in Large Language Models
Paloma Piot
Patricia Martín-Rodilla
Javier Parapar
50
0
0
04 May 2025
cantnlp@DravidianLangTech2025: A Bag-of-Sounds Approach to Multimodal Hate Speech Detection
Sidney Wong
Andrew Li
47
0
0
10 Mar 2025
MASS: Overcoming Language Bias in Image-Text Matching
Jiwan Chung
Seungwon Lim
Sangkyu Lee
Youngjae Yu
VLM
32
0
0
20 Jan 2025
LLM-Human Pipeline for Cultural Context Grounding of Conversations
Rajkumar Pujari
Dan Goldwasser
38
1
0
17 Oct 2024
One Language, Many Gaps: Evaluating Dialect Fairness and Robustness of Large Language Models in Reasoning Tasks
Fangru Lin
Shaoguang Mao
Emanuele La Malfa
Valentin Hofmann
Adrian de Wynter
Jing Yao
Si-Qing Chen
Michael Wooldridge
Furu Wei
51
2
0
14 Oct 2024
Analyzing Correlations Between Intrinsic and Extrinsic Bias Metrics of Static Word Embeddings With Their Measuring Biases Aligned
Taisei Katô
Yusuke Miyao
19
0
0
14 Sep 2024
Towards Generalized Offensive Language Identification
A. Dmonte
Tejas Arya
Tharindu Ranasinghe
Marcos Zampieri
52
3
0
26 Jul 2024
Situated Ground Truths: Enhancing Bias-Aware AI by Situating Data Labels with SituAnnotate
Delfina Sol Martinez Pandiani
Valentina Presutti
24
1
0
10 Jun 2024
From Languages to Geographies: Towards Evaluating Cultural Bias in Hate Speech Datasets
Manuel Tonneau
Diyi Liu
Samuel Fraiberger
Ralph Schroeder
Scott A. Hale
Paul Röttger
37
5
0
27 Apr 2024
Chinese Offensive Language Detection: Current Status and Future Directions
Yunze Xiao
Houda Bouamor
Wajdi Zaghouani
43
2
0
27 Mar 2024
InfFeed: Influence Functions as a Feedback to Improve the Performance of Subjective Tasks
Somnath Banerjee
Maulindu Sarkar
Punyajoy Saha
Binny Mathew
Animesh Mukherjee
TDI
34
0
0
22 Feb 2024
Beyond Detection: Unveiling Fairness Vulnerabilities in Abusive Language Models
Yueqing Liang
Lu Cheng
Ali Payani
Kai Shu
28
3
0
15 Nov 2023
Examining Temporal Bias in Abusive Language Detection
Mali Jin
Yida Mu
Diana Maynard
Kalina Bontcheva
34
5
0
25 Sep 2023
HateModerate: Testing Hate Speech Detectors against Content Moderation Policies
Jiangrui Zheng
Xueqing Liu
Guanqun Yang
Mirazul Haque
Xing Qian
Ravishka Rathnasuriya
Wei Yang
G. Budhrani
42
3
0
23 Jul 2023
A Weakly Supervised Classifier and Dataset of White Supremacist Language
Michael Miller Yoder
Ahmad Diab
D. W. Brown
Kathleen M. Carley
30
5
0
27 Jun 2023
CFL: Causally Fair Language Models Through Token-level Attribute Controlled Generation
Rahul Madhavan
Rishabh Garg
Kahini Wadhawan
S. Mehta
29
5
0
01 Jun 2023
Comparing Biases and the Impact of Multilingual Training across Multiple Languages
Sharon Levy
Neha Ann John
Ling Liu
Yogarshi Vyas
Jie Ma
Yoshinari Fujinuma
Miguel Ballesteros
Vittorio Castelli
Dan Roth
26
25
0
18 May 2023
A statistical approach to detect sensitive features in a group fairness setting
G. D. Pelegrina
Miguel Couceiro
L. Duarte
16
3
0
11 May 2023
Should ChatGPT be Biased? Challenges and Risks of Bias in Large Language Models
Emilio Ferrara
SILM
36
247
0
07 Apr 2023
Sociocultural knowledge is needed for selection of shots in hate speech detection tasks
Antonis Maronikolakis
Abdullatif Köksal
Hinrich Schütze
43
0
0
04 Apr 2023
Rating Sentiment Analysis Systems for Bias through a Causal Lens
Kausik Lakkaraju
Biplav Srivastava
Marco Valtorta
34
7
0
04 Feb 2023
A Comprehensive Study of Gender Bias in Chemical Named Entity Recognition Models
Xingmeng Zhao
A. Niazi
Anthony Rios
31
2
0
24 Dec 2022
Multi-VALUE: A Framework for Cross-Dialectal English NLP
Caleb Ziems
William B. Held
Jingfeng Yang
Jwala Dhamala
Rahul Gupta
Diyi Yang
46
40
0
15 Dec 2022
Multimodal and Explainable Internet Meme Classification
A. Thakur
Filip Ilievski
Hông-Ân Sandlin
Zhivar Sourati
Luca Luceri
Riccardo Tommasini
Alain Mermoud
27
6
0
11 Dec 2022
Human-in-the-Loop Hate Speech Classification in a Multilingual Context
Ana Kotarcic
Dominik Hangartner
Fabrizio Gilardi
Selina Kurer
K. Donnay
24
2
0
05 Dec 2022
Human-Machine Collaboration Approaches to Build a Dialogue Dataset for Hate Speech Countering
Helena Bonaldi
Sara Dellantonio
Serra Sinem Tekiroğlu
Marco Guerini
29
41
0
07 Nov 2022
Why Is It Hate Speech? Masked Rationale Prediction for Explainable Hate Speech Detection
Jiyun Kim
Byounghan Lee
Kyung-ah Sohn
21
13
0
01 Nov 2022
Detecting Unintended Social Bias in Toxic Language Datasets
Nihar Ranjan Sahoo
Himanshu Gupta
P. Bhattacharyya
15
18
0
21 Oct 2022
Re-contextualizing Fairness in NLP: The Case of India
Shaily Bhatt
Sunipa Dev
Partha P. Talukdar
Shachi Dave
Vinodkumar Prabhakaran
16
54
0
25 Sep 2022
Towards WinoQueer: Developing a Benchmark for Anti-Queer Bias in Large Language Models
Virginia K. Felkner
Ho-Chun Herbert Chang
Eugene Jang
Jonathan May
OSLM
21
8
0
23 Jun 2022
Multilingual HateCheck: Functional Tests for Multilingual Hate Speech Detection Models
Paul Röttger
Haitham Seelawi
Debora Nozza
Zeerak Talat
Bertie Vidgen
30
65
0
20 Jun 2022
Detecting Harmful Online Conversational Content towards LGBTQIA+ Individuals
Jamell Dacon
Harry Shomer
Shaylynn Crum-Dacon
Jiliang Tang
24
8
0
15 Jun 2022
KOLD: Korean Offensive Language Dataset
Young-kuk Jeong
Juhyun Oh
Jaimeen Ahn
Jongwon Lee
Jihyung Mon
Sungjoon Park
Alice H. Oh
57
25
0
23 May 2022
Analyzing Hate Speech Data along Racial, Gender and Intersectional Axes
Antonis Maronikolakis
Philip Baader
Hinrich Schütze
22
9
0
13 May 2022
Hidden behind the obvious: misleading keywords and implicitly abusive language on social media
Wenjie Yin
A. Zubiaga
31
26
0
03 May 2022
Is Your Toxicity My Toxicity? Exploring the Impact of Rater Identity on Toxicity Annotation
Nitesh Goyal
Ian D Kivlichan
Rachel Rosen
Lucy Vasserman
41
90
0
01 May 2022
Using Pre-Trained Language Models for Producing Counter Narratives Against Hate Speech: a Comparative Study
Serra Sinem Tekiroğlu
Helena Bonaldi
Margherita Fanton
Marco Guerini
24
43
0
04 Apr 2022
Listening to Affected Communities to Define Extreme Speech: Dataset and Experiments
Antonis Maronikolakis
Axel Wisiorek
Leah Nann
Haris Jabbar
Sahana Udupa
Hinrich Schütze
24
24
0
22 Mar 2022
A New Generation of Perspective API: Efficient Multilingual Character-level Transformers
Alyssa Lees
Vinh Q. Tran
Yi Tay
Jeffrey Scott Sorensen
Jai Gupta
Donald Metzler
Lucy Vasserman
25
173
0
22 Feb 2022
Handling Bias in Toxic Speech Detection: A Survey
Tanmay Garg
Sarah Masud
Tharun Suresh
Tanmoy Chakraborty
17
91
0
26 Jan 2022
Causal effect of racial bias in data and machine learning algorithms on user persuasiveness & discriminatory decision making: An Empirical Study
Kinshuk Sengupta
Praveen Ranjan Srivastava
36
6
0
22 Jan 2022
Few-shot Instruction Prompts for Pretrained Language Models to Detect Social Biases
Shrimai Prabhumoye
Rafal Kocielnik
M. Shoeybi
Anima Anandkumar
Bryan Catanzaro
35
20
0
15 Dec 2021
Unraveling Social Perceptions & Behaviors towards Migrants on Twitter
A. Khatua
Wolfgang Nejdl
27
11
0
04 Dec 2021
"Stop Asian Hate!" : Refining Detection of Anti-Asian Hate Speech During the COVID-19 Pandemic
H. Nghiem
Fred Morstatter
25
8
0
04 Dec 2021
CO-STAR: Conceptualisation of Stereotypes for Analysis and Reasoning
Teyun Kwon
Anandha Gopalan
25
2
0
01 Dec 2021
Annotators with Attitudes: How Annotator Beliefs And Identities Bias Toxic Language Detection
Maarten Sap
Swabha Swayamdipta
Laura Vianna
Xuhui Zhou
Yejin Choi
Noah A. Smith
46
267
0
15 Nov 2021
Cross-lingual Hate Speech Detection using Transformer Models
Teodor Tita
A. Zubiaga
14
13
0
01 Nov 2021
BBQ: A Hand-Built Bias Benchmark for Question Answering
Alicia Parrish
Angelica Chen
Nikita Nangia
Vishakh Padmakumar
Jason Phang
Jana Thompson
Phu Mon Htut
Sam Bowman
223
374
0
15 Oct 2021
Mitigating Racial Biases in Toxic Language Detection with an Equity-Based Ensemble Framework
Matan Halevy
Camille Harris
A. Bruckman
Diyi Yang
A. Howard
42
35
0
27 Sep 2021
Latent Hatred: A Benchmark for Understanding Implicit Hate Speech
Mai Elsherief
Caleb Ziems
D. Muchlinski
Vaishnavi Anupindi
Jordyn Seybolt
M. D. Choudhury
Diyi Yang
106
237
0
11 Sep 2021