2407.02987
LoRA-Guard: Parameter-Efficient Guardrail Adaptation for Content Moderation of Large Language Models
3 July 2024
Hayder Elesedy
Pedro M. Esperança
Silviu Vlad Oprea
Mete Ozay
KELM
Papers citing
"LoRA-Guard: Parameter-Efficient Guardrail Adaptation for Content Moderation of Large Language Models"
6 / 6 papers shown
No Free Lunch with Guardrails
Divyanshu Kumar
Nitin Aravind Birur
Tanay Baswa
Sahil Agarwal
P. Harshangi
67
1
0
01 Apr 2025
Bridging Today and the Future of Humanity: AI Safety in 2024 and Beyond
Shanshan Han
87
1
0
09 Oct 2024
Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey
Zhichen Dong
Zhanhui Zhou
Chao Yang
Jing Shao
Yu Qiao
ELM
58
59
0
14 Feb 2024
GPTFUZZER: Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts
Jiahao Yu
Xingwei Lin
Zheng Yu
Xinyu Xing
SILM
126
320
0
19 Sep 2023
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
457
12,345
0
04 Mar 2022
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester
Rami Al-Rfou
Noah Constant
VPVLM
298
3,906
0
18 Apr 2021