arXiv: 2410.02657
Hate Personified: Investigating the role of LLMs in content moderation
3 October 2024
Sarah Masud, Sahajpreet Singh, Viktor Hangya, Alexander Fraser, Tanmoy Chakraborty
Papers citing "Hate Personified: Investigating the role of LLMs in content moderation" (6 of 6 papers shown):
1. Can Prompting LLMs Unlock Hate Speech Detection across Languages? A Zero-shot and Few-shot Study
   Faeze Ghorbanpour, Daryna Dementieva, Alexander Fraser (09 May 2025)
2. Out of Sight Out of Mind: Measuring Bias in Language Models Against Overlooked Marginalized Groups in Regional Contexts
   Fatma Elsafoury, David Hartmann (17 Apr 2025)
3. Selective Demonstration Retrieval for Improved Implicit Hate Speech Detection
   Yumin Kim, Hwanhee Lee (16 Apr 2025)
4. LLM-C3MOD: A Human-LLM Collaborative System for Cross-Cultural Hate Speech Moderation
   Junyeong Park, Seogyeong Jeong, Shri Kiran Srinivasan, Yohan Lee, Alice Oh (10 Mar 2025)
5. Revealing Hidden Mechanisms of Cross-Country Content Moderation with Natural Language Processing
   Neemesh Yadav, Jiarui Liu, Francesco Ortu, Roya Ensafi, Zhijing Jin, Rada Mihalcea (07 Mar 2025)
6. Extreme Speech Classification in the Era of LLMs: Exploring Open-Source and Proprietary Models
   Sarthak Mahajan, Nimmi Rangaswamy (24 Feb 2025)