Detecting Hate Speech with GPT-3

23 March 2021
Ke-Li Chiu
Annie Collins
Rohan Alexander
    AILaw

Papers citing "Detecting Hate Speech with GPT-3"

16 / 16 papers shown
One Trigger Token Is Enough: A Defense Strategy for Balancing Safety and Usability in Large Language Models
Haoran Gu
Handing Wang
Yi Mei
Mengjie Zhang
Yaochu Jin
27
0
0
12 May 2025
Digital Guardians: Can GPT-4, Perspective API, and Moderation API reliably detect hate speech in reader comments of German online newspapers?
Manuel Weber
Moritz Huber
Maximilian Auch
Alexander Döschl
Max-Emanuel Keller
P. Mandl
32
0
0
03 Jan 2025
Assessing Text Classification Methods for Cyberbullying Detection on Social Media Platforms
Adamu Gaston Philipo
Doreen Sebastian Sarwatt
Jianguo Ding
M. Daneshmand
Huansheng Ning
48
0
0
31 Dec 2024
Recent Advances in Attack and Defense Approaches of Large Language Models
Jing Cui
Yishi Xu
Zhewei Huang
Shuchang Zhou
Jianbin Jiao
Junge Zhang
PILM
AAML
57
1
0
05 Sep 2024
MisgenderMender: A Community-Informed Approach to Interventions for Misgendering
Tamanna Hossain
Sunipa Dev
Sameer Singh
35
5
0
23 Apr 2024
An Investigation of Large Language Models for Real-World Hate Speech Detection
Keyan Guo
Alexander Hu
Jaden Mu
Ziheng Shi
Ziming Zhao
Nishant Vishwamitra
Hongxin Hu
25
12
0
07 Jan 2024
Generative AI for Hate Speech Detection: Evaluation and Findings
Sagi Pendzel
Tomer Wullach
Amir Adler
Einat Minkov
30
11
0
16 Nov 2023
Watch Your Language: Investigating Content Moderation with Large Language Models
Deepak Kumar
Y. AbuHashem
Zakir Durumeric
AI4MH
36
15
0
25 Sep 2023
Tricking LLMs into Disobedience: Formalizing, Analyzing, and Detecting Jailbreaks
Abhinav Rao
S. Vashistha
Atharva Naik
Somak Aditya
Monojit Choudhury
35
17
0
24 May 2023
AI model GPT-3 (dis)informs us better than humans
Giovanni Spitale
Nikola Biller-Andorno
Federico Germani
DeLMO
21
150
0
23 Jan 2023
GLM-130B: An Open Bilingual Pre-trained Model
Aohan Zeng
Xiao Liu
Zhengxiao Du
Zihan Wang
Hanyu Lai
...
Jidong Zhai
Wenguang Chen
Peng-Zhen Zhang
Yuxiao Dong
Jie Tang
BDL
LRM
253
1,073
0
05 Oct 2022
OPT: Open Pre-trained Transformer Language Models
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
...
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
VLM
OSLM
AI4CE
61
3,500
0
02 May 2022
Leveraging pre-trained language models for conversational information seeking from text
Patrizio Bellan
M. Dragoni
Chiara Ghidini
27
6
0
31 Mar 2022
Survey of Generative Methods for Social Media Analysis
Stan Matwin
Aristides Milios
P. Prałat
Amílcar Soares
François Théberge
27
3
0
13 Dec 2021
AI Chains: Transparent and Controllable Human-AI Interaction by Chaining Large Language Model Prompts
Tongshuang Wu
Michael Terry
Carrie J. Cai
LLMAG
AI4CE
LRM
37
447
0
04 Oct 2021
Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP
Timo Schick
Sahana Udupa
Hinrich Schütze
259
374
0
28 Feb 2021