2308.16549
Thesis Distillation: Investigating The Impact of Bias in NLP Models on Hate Speech Detection
31 August 2023
Fatma Elsafoury
Papers citing "Thesis Distillation: Investigating The Impact of Bias in NLP Models on Hate Speech Detection"
27 / 27 papers shown
Systematic Offensive Stereotyping (SOS) Bias in Language Models
Fatma Elsafoury
13
2
0
21 Aug 2023
On the Origins of Bias in NLP through the Lens of the Jim Code
Fatma Elsafoury
Gavin Abercrombie
81
4
0
16 May 2023
Debiasing isn't enough! -- On the Effectiveness of Debiasing MLMs and their Social Biases in Downstream Tasks
Masahiro Kaneko
Danushka Bollegala
Naoaki Okazaki
56
45
0
06 Oct 2022
Explainable Abuse Detection as Intent Classification and Slot Filling
Agostina Calabrese
Bjorn Ross
Mirella Lapata
85
11
0
06 Oct 2022
On the Intrinsic and Extrinsic Fairness Evaluation Metrics for Contextualized Language Representations
Yang Trista Cao
Yada Pruksachatkun
Kai-Wei Chang
Rahul Gupta
Varun Kumar
Jwala Dhamala
Aram Galstyan
42
99
0
25 Mar 2022
ToxiGen: A Large-Scale Machine-Generated Dataset for Adversarial and Implicit Hate Speech Detection
Thomas Hartvigsen
Saadia Gabriel
Hamid Palangi
Maarten Sap
Dipankar Ray
Ece Kamar
78
384
0
17 Mar 2022
Intrinsic Bias Metrics Do Not Correlate with Application Bias
Seraphina Goldfarb-Tarrant
Rebecca Marchant
Ricardo Muñoz Sánchez
Mugdha Pandya
Adam Lopez
104
179
0
31 Dec 2020
Measuring and Reducing Gendered Correlations in Pre-trained Models
Kellie Webster
Xuezhi Wang
Ian Tenney
Alex Beutel
Emily Pitler
Ellie Pavlick
Jilin Chen
Ed Chi
Slav Petrov
FaML
79
260
0
12 Oct 2020
CrowS-Pairs: A Challenge Dataset for Measuring Social Biases in Masked Language Models
Nikita Nangia
Clara Vania
Rasika Bhalerao
Samuel R. Bowman
123
682
0
30 Sep 2020
Towards Debiasing Sentence Representations
Paul Pu Liang
Irene Li
Emily Zheng
Y. Lim
Ruslan Salakhutdinov
Louis-Philippe Morency
78
239
0
16 Jul 2020
StereoSet: Measuring stereotypical bias in pretrained language models
Moin Nadeem
Anna Bethke
Siva Reddy
101
1,011
0
20 Apr 2020
4chan & 8chan embeddings
Pierre Voué
T. Smedt
G. Pauw
AI4MH
32
5
0
02 Apr 2020
Predictive Biases in Natural Language Processing Models: A Conceptual Framework and Overview
Deven Santosh Shah
H. Andrew Schwartz
Dirk Hovy
AI4CE
101
260
0
09 Nov 2019
A BERT-Based Transfer Learning Approach for Hate Speech Detection in Online Social Media
Marzieh Mozafari
R. Farahbakhsh
Noel Crespi
67
352
0
28 Oct 2019
Perturbation Sensitivity Analysis to Detect Unintended Model Biases
Vinodkumar Prabhakaran
Ben Hutchinson
Margaret Mitchell
56
119
0
09 Oct 2019
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
SSL
AIMat
371
6,463
0
26 Sep 2019
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Yinhan Liu
Myle Ott
Naman Goyal
Jingfei Du
Mandar Joshi
Danqi Chen
Omer Levy
M. Lewis
Luke Zettlemoyer
Veselin Stoyanov
AIMat
665
24,528
0
26 Jul 2019
Measuring Bias in Contextualized Word Representations
Keita Kurita
Nidhi Vyas
Ayush Pareek
A. Black
Yulia Tsvetkov
106
451
0
18 Jun 2019
On Measuring Social Biases in Sentence Encoders
Chandler May
Alex Jinpeng Wang
Shikha Bordia
Samuel R. Bowman
Rachel Rudinger
99
603
0
25 Mar 2019
Nuanced Metrics for Measuring Unintended Bias with Real Data for Text Classification
Daniel Borkan
Lucas Dixon
Jeffrey Scott Sorensen
Nithum Thain
Lucy Vasserman
88
491
0
11 Mar 2019
Lipstick on a Pig: Debiasing Methods Cover up Systematic Gender Biases in Word Embeddings But do not Remove Them
Hila Gonen
Yoav Goldberg
103
571
0
09 Mar 2019
Bias in Bios: A Case Study of Semantic Representation Bias in a High-Stakes Setting
Maria De-Arteaga
Alexey Romanov
Hanna M. Wallach
J. Chayes
C. Borgs
Alexandra Chouldechova
S. Geyik
K. Kenthapadi
Adam Tauman Kalai
191
460
0
27 Jan 2019
Attenuating Bias in Word Vectors
Sunipa Dev
J. M. Phillips
FaML
72
151
0
23 Jan 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
1.8K
95,114
0
11 Oct 2018
Emo, Love, and God: Making Sense of Urban Dictionary, a Crowd-Sourced Online Dictionary
Dong Nguyen
Barbara McGillivray
T. Yasseri
3DV
26
26
0
22 Dec 2017
Word Embeddings Quantify 100 Years of Gender and Ethnic Stereotypes
Nikhil Garg
L. Schiebinger
Dan Jurafsky
James Zou
AI4TS
69
965
0
22 Nov 2017
Semantics derived automatically from language corpora contain human-like biases
Aylin Caliskan
J. Bryson
Arvind Narayanan
213
2,670
0
25 Aug 2016