Stereotypical Bias Removal for Hate Speech Detection Task using Knowledge-based Generalizations

15 January 2020

Vasudeva Varma

Papers citing "Stereotypical Bias Removal for Hate Speech Detection Task using Knowledge-based Generalizations"

42 / 42 papers shown

SWE2: SubWord Enriched and Significant Word Emphasized Framework for Hate Speech Detection Guanyi Mou Pengyi Ye Kyumin Lee 39 17 0 25 Sep 2024
An Effective, Robust and Fairness-aware Hate Speech Detection Framework Guanyi Mou Kyumin Lee 29 2 0 25 Sep 2024
NLPGuard: A Framework for Mitigating the Use of Protected Attributes by NLP Classifiers Salvatore Greco Ke Zhou L. Capra Tania Cerquitelli Daniele Quercia 36 2 0 01 Jul 2024
Disentangling Dialect from Social Bias via Multitask Learning to Improve Fairness Maximilian Spliethover Sai Nikhil Menon Henning Wachsmuth 44 2 0 14 Jun 2024
Hate Speech Detection with Generalizable Target-aware Fairness Tong Chen Danny Wang Xurong Liang Marten Risius Gianluca Demartini Hongzhi Yin 35 3 0 28 May 2024
Algorithmic Fairness: A Tolerance Perspective Renqiang Luo Tao Tang Feng Xia Jiaying Liu Chengpei Xu Leo Yu Zhang Wei Xiang Chengqi Zhang FaML 74 0 0 26 Apr 2024
ID-XCB: Data-independent Debiasing for Fair and Accurate Transformer-based Cyberbullying Detection Peiling Yi A. Zubiaga 29 0 0 26 Feb 2024
The Media Bias Taxonomy: A Systematic Literature Review on the Forms and Automated Detection of Media Bias Timo Spinde Smilla Hinterreiter Fabian Haak Terry Ruas Helge Giese Norman Meuschke Bela Gipp 27 12 0 26 Dec 2023
Beyond Detection: Unveiling Fairness Vulnerabilities in Abusive Language Models Yueqing Liang Lu Cheng Ali Payani Kai Shu 28 3 0 15 Nov 2023
Studying Socially Unacceptable Discourse Classification (SUD) through different eyes: "Are we on the same page ?" Bruno Machado Carneiro Michele Linardi Julien Longhi 12 2 0 08 Aug 2023
Fair Enough: Standardizing Evaluation and Model Selection for Fairness Research in NLP Xudong Han Timothy Baldwin Trevor Cohn 34 12 0 11 Feb 2023
A Comprehensive Study of Gender Bias in Chemical Named Entity Recognition Models Xingmeng Zhao A. Niazi Anthony Rios 31 2 0 24 Dec 2022
Controlling Bias Exposure for Fair Interpretable Predictions Zexue He Yu-Xiang Wang Julian McAuley Bodhisattwa Prasad Majumder 27 19 0 14 Oct 2022
InterFair: Debiasing with Natural Language Feedback for Fair Interpretable Predictions Bodhisattwa Prasad Majumder Zexue He Julian McAuley 21 5 0 14 Oct 2022
Measuring Geographic Performance Disparities of Offensive Language Classifiers Brandon Lwowski P. Rad Anthony Rios 44 5 0 15 Sep 2022
SoK: Content Moderation in Social Media, from Guidelines to Enforcement, and Research to Practice Mohit Singhal Chen Ling Pujan Paudel Poojitha Thota Nihal Kumarswamy Gianluca Stringhini Shirin Nilizadeh 75 28 0 29 Jun 2022
StereoKG: Data-Driven Knowledge Graph Construction for Cultural Knowledge and Stereotypes Awantee V. Deshpande Dana Ruiter Marius Mosbach Dietrich Klakow 14 11 0 27 May 2022
fairlib: A Unified Framework for Assessing and Improving Classification Fairness Xudong Han Aili Shen Yitong Li Lea Frermann Timothy Baldwin Trevor Cohn VLM FaML 23 12 0 04 May 2022
Hidden behind the obvious: misleading keywords and implicitly abusive language on social media Wenjie Yin A. Zubiaga 31 26 0 03 May 2022
Balancing Fairness and Accuracy in Sentiment Detection using Multiple Black Box Models Abdulaziz A. Almuzaini V. Singh MLAU FaML 36 6 0 22 Apr 2022
Improving Generalizability in Implicitly Abusive Language Detection with Concept Activation Vectors I. Nejadgholi Kathleen C. Fraser S. Kiritchenko 16 18 0 05 Apr 2022
On Explaining Multimodal Hateful Meme Detection Models Ming Shan Hee Roy Ka-Wei Lee Wen-Haw Chong VLM 21 39 0 04 Apr 2022
Reinforcement Guided Multi-Task Learning Framework for Low-Resource Stereotype Detection Rajkumar Pujari Erik Oveson Priyanka Kulkarni E. Nouri 42 8 0 27 Mar 2022
Representation Bias in Data: A Survey on Identification and Resolution Techniques N. Shahbazi Yin Lin Abolfazl Asudeh H. V. Jagadish 48 68 0 22 Mar 2022
Suum Cuique: Studying Bias in Taboo Detection with a Community Perspective Osama Khalid Jonathan Rusert P. Srinivasan 11 1 0 22 Mar 2022
Towards Equal Opportunity Fairness through Adversarial Learning Xudong Han Timothy Baldwin Trevor Cohn FaML 25 8 0 12 Mar 2022
Handling Bias in Toxic Speech Detection: A Survey Tanmay Garg Sarah Masud Tharun Suresh Tanmoy Chakraborty 17 91 0 26 Jan 2022
Unintended Bias in Language Model-driven Conversational Recommendation Tianshu Shen Jiaru Li Mohamed Reda Bouadjenek Zheda Mai Scott Sanner 14 7 0 17 Jan 2022
Contrastive Learning for Fair Representations Aili Shen Xudong Han Trevor Cohn Timothy Baldwin Lea Frermann FaML 42 33 0 22 Sep 2021
Fairness-aware Class Imbalanced Learning Shivashankar Subramanian Afshin Rahimi Timothy Baldwin Trevor Cohn Lea Frermann FaML 109 28 0 21 Sep 2021
Balancing out Bias: Achieving Fairness Through Balanced Training Xudong Han Timothy Baldwin Trevor Cohn 26 39 0 16 Sep 2021
Unsupervised Domain Adaptation for Hate Speech Detection Using a Data Augmentation Approach Sheikh Muhammad Sarwar Vanessa Murdock 39 19 0 27 Jul 2021
Quantifying Social Biases in NLP: A Generalization and Empirical Comparison of Extrinsic Fairness Metrics Paula Czarnowska Yogarshi Vyas Kashif Shah 21 104 0 28 Jun 2021
An Information Retrieval Approach to Building Datasets for Hate Speech Detection Md. Mustafizur Rahman Dinesh Balakrishnan Dhiraj Murthy Mucahid Kutlu Matthew Lease 12 24 0 17 Jun 2021
Mitigating Biases in Toxic Language Detection through Invariant Rationalization Yung-Sung Chuang Mingye Gao Hongyin Luo James R. Glass Hung-yi Lee Yun-Nung Chen Shang-Wen Li 19 12 0 14 Jun 2021
A Neighbourhood Framework for Resource-Lean Content Flagging Sheikh Muhammad Sarwar Dimitrina Zlatkova Momchil Hardalov Yoan Dinkov Isabelle Augenstein Preslav Nakov 24 5 0 31 Mar 2021
Towards generalisable hate speech detection: a review on obstacles and solutions Wenjie Yin A. Zubiaga 117 164 0 17 Feb 2021
Task Adaptive Pretraining of Transformers for Hostility Detection Tathagata Raha Sayar Ghosh Roy Ujwal Narayan Zubair Abid Vasudeva Varma 15 9 0 09 Jan 2021
ALONE: A Dataset for Toxic Behavior among Adolescents on Twitter Thilini Wijesiriwardene Hale Inan Ugur Kursuncu Manas Gaur V. Shalin K. Thirunarayan A. Sheth I. Arpinar 13 39 0 14 Aug 2020
Hate Speech Detection and Racial Bias Mitigation in Social Media based on BERT model Marzieh Mozafari R. Farahbakhsh Noel Crespi 4 211 0 14 Aug 2020
Language (Technology) is Power: A Critical Survey of "Bias" in NLP Su Lin Blodgett Solon Barocas Hal Daumé Hanna M. Wallach 53 1,191 0 28 May 2020
Cyberbullying Detection with Fairness Constraints O. Gencoglu 13 48 0 09 May 2020