A Data-Centric Approach for Safe and Secure Large Language Models against Threatening and Toxic Content

19 April 2025
Chaima Njeh
Haïfa Nakouri
Fehmi Jaafar

Papers citing "A Data-Centric Approach for Safe and Secure Large Language Models against Threatening and Toxic Content"

5 / 5 papers shown
N-Critics: Self-Refinement of Large Language Models with Ensemble of Critics
Sajad Mousavi
Ricardo Luna Gutierrez
Desik Rengarajan
Vineet Gundecha
Ashwin Ramesh Babu
Avisek Naug
Antonio Guillen-Perez
Soumyendu Sarkar
LRM, HILM, KELM
39
7
0
28 Oct 2023
CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing
Zhibin Gou
Zhihong Shao
Yeyun Gong
Yelong Shen
Yujiu Yang
Nan Duan
Weizhu Chen
KELM, LRM
81
394
0
19 May 2023
Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision
Zhiqing Sun
Songlin Yang
Qinhong Zhou
Hongxin Zhang
Zhenfang Chen
David D. Cox
Yiming Yang
Chuang Gan
SyDa, ALM
99
337
0
04 May 2023
Large Language Models Can Self-Improve
Jiaxin Huang
S. Gu
Le Hou
Yuexin Wu
Xuezhi Wang
Hongkun Yu
Jiawei Han
ReLM, AI4MH, LRM
201
612
0
20 Oct 2022
HateBERT: Retraining BERT for Abusive Language Detection in English
Tommaso Caselli
Valerio Basile
Jelena Mitrović
Michael Granitzer
82
373
0
23 Oct 2020