2402.16444
Cited By
ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors
26 February 2024
Zhexin Zhang, Yida Lu, Jingyuan Ma, Di Zhang, Rui Li, Pei Ke, Hao Sun, Lei Sha, Zhifang Sui, Hongning Wang, Minlie Huang
Github (191★)
Papers citing "ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors"
3 / 3 papers shown
Be a Multitude to Itself: A Prompt Evolution Framework for Red Teaming
Rui Li, Peiyi Wang, Jingyuan Ma, Di Zhang, Lei Sha, Zhifang Sui
LLMAG
154
0
0
22 Feb 2025
Online Safety Analysis for LLMs: a Benchmark, an Assessment, and a Path Forward
Xuan Xie, Jiayang Song, Zhehua Zhou, Yuheng Huang, Da Song, Lei Ma
OffRL
125
6
0
12 Apr 2024
Baichuan 2: Open Large-scale Language Models
Ai Ming Yang, Bin Xiao, Bingning Wang, Borong Zhang, Ce Bian, ..., Youxin Jiang, Yuchen Gao, Yupeng Zhang, Guosheng Dong, Zhiying Wu
ELM
LRM
320
755
0
19 Sep 2023