ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2203.03072
  4. Cited By
Leashing the Inner Demons: Self-Detoxification for Language Models

Leashing the Inner Demons: Self-Detoxification for Language Models

6 March 2022
Canwen Xu
Zexue He
Zhankui He
Julian McAuley
ArXivPDFHTML

Papers citing "Leashing the Inner Demons: Self-Detoxification for Language Models"

8 / 8 papers shown
Title
Combating Toxic Language: A Review of LLM-Based Strategies for Software Engineering
Combating Toxic Language: A Review of LLM-Based Strategies for Software Engineering
Hao Zhuo
Yicheng Yang
Kewen Peng
30
0
0
21 Apr 2025
LLM-Personalize: Aligning LLM Planners with Human Preferences via Reinforced Self-Training for Housekeeping Robots
LLM-Personalize: Aligning LLM Planners with Human Preferences via Reinforced Self-Training for Housekeeping Robots
Dongge Han
Trevor A. McInroe
Adam Jelley
Stefano V. Albrecht
Peter Bell
Amos Storkey
61
11
0
31 Dec 2024
FrenchToxicityPrompts: a Large Benchmark for Evaluating and Mitigating
  Toxicity in French Texts
FrenchToxicityPrompts: a Large Benchmark for Evaluating and Mitigating Toxicity in French Texts
Caroline Brun
Vassilina Nikoulina
40
1
0
25 Jun 2024
CMD: a framework for Context-aware Model self-Detoxification
CMD: a framework for Context-aware Model self-Detoxification
Zecheng Tang
Keyan Zhou
Juntao Li
Yuyang Ding
Pinzheng Wang
Bowen Yan
Minzhang
MU
25
5
0
16 Aug 2023
Synthetic Pre-Training Tasks for Neural Machine Translation
Synthetic Pre-Training Tasks for Neural Machine Translation
Zexue He
Graeme W. Blackwood
Yikang Shen
Julian McAuley
Rogerio Feris
29
3
0
19 Dec 2022
Controlling Bias Exposure for Fair Interpretable Predictions
Controlling Bias Exposure for Fair Interpretable Predictions
Zexue He
Yu Wang
Julian McAuley
Bodhisattwa Prasad Majumder
27
19
0
14 Oct 2022
Why So Toxic? Measuring and Triggering Toxic Behavior in Open-Domain
  Chatbots
Why So Toxic? Measuring and Triggering Toxic Behavior in Open-Domain Chatbots
Waiman Si
Michael Backes
Jeremy Blackburn
Emiliano De Cristofaro
Gianluca Stringhini
Savvas Zannettou
Yang Zhang
36
58
0
07 Sep 2022
The Woman Worked as a Babysitter: On Biases in Language Generation
The Woman Worked as a Babysitter: On Biases in Language Generation
Emily Sheng
Kai-Wei Chang
Premkumar Natarajan
Nanyun Peng
225
621
0
03 Sep 2019
1