ResearchTrend.AI

ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors

26 February 2024
Zhexin Zhang
Yida Lu
Jingyuan Ma
Di Zhang
Rui Li
Pei Ke
Hao Sun
Lei Sha
Zhifang Sui
Hongning Wang
Minlie Huang
ArXiv (abs) | PDF | HTML | GitHub (191★)

Papers citing "ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors"

3 / 3 papers shown
Be a Multitude to Itself: A Prompt Evolution Framework for Red Teaming
Rui Li
Peiyi Wang
Jingyuan Ma
Di Zhang
Lei Sha
Zhifang Sui
LLMAG
154
0
0
22 Feb 2025
Online Safety Analysis for LLMs: a Benchmark, an Assessment, and a Path Forward
Xuan Xie
Jiayang Song
Zhehua Zhou
Yuheng Huang
Da Song
Lei Ma
OffRL
125
6
0
12 Apr 2024
Baichuan 2: Open Large-scale Language Models
Ai Ming Yang
Bin Xiao
Bingning Wang
Borong Zhang
Ce Bian
...
Youxin Jiang
Yuchen Gao
Yupeng Zhang
Guosheng Dong
Zhiying Wu
ELM, LRM
320
755
0
19 Sep 2023