LoRA-Guard: Parameter-Efficient Guardrail Adaptation for Content Moderation of Large Language Models

3 July 2024
Hayder Elesedy, Pedro M. Esperança, Silviu Vlad Oprea, Mete Ozay
KELM
ArXiv (abs) · PDF · HTML
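
As context for the title: below is a minimal, hypothetical sketch of what LoRA-style guardrail adaptation can look like, not the authors' implementation. It assumes a PyTorch setup; the names LoRALinear and GuardrailHead and all hyperparameters are illustrative. The base weights stay frozen while only the low-rank factors and a small moderation head are trained, which is where the parameter efficiency in the title comes from.

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    # Frozen base projection plus a trainable low-rank residual:
    # y = W x + (alpha / r) * B (A x), with W frozen and A, B trainable.
    def __init__(self, d_in, d_out, r=8, alpha=16):
        super().__init__()
        self.base = nn.Linear(d_in, d_out, bias=False)
        self.base.weight.requires_grad_(False)              # freeze base weight
        self.A = nn.Parameter(torch.randn(r, d_in) * 0.01)  # low-rank down-projection
        self.B = nn.Parameter(torch.zeros(d_out, r))        # low-rank up-projection (zero init)
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

class GuardrailHead(nn.Module):
    # Small classifier over mean-pooled hidden states (e.g., safe vs. harmful).
    def __init__(self, d_model, n_labels=2):
        super().__init__()
        self.proj = nn.Linear(d_model, n_labels)

    def forward(self, hidden):                  # hidden: (batch, seq, d_model)
        return self.proj(hidden.mean(dim=1))    # pool over sequence, then classify

# Toy usage: only the LoRA factors and the head carry gradients.
d_model = 64
layer, head = LoRALinear(d_model, d_model), GuardrailHead(d_model)
x = torch.randn(2, 10, d_model)                 # stand-in for hidden states
logits = head(layer(x))                         # (2, 2) moderation logits
trainable = sum(p.numel() for m in (layer, head)
                for p in m.parameters() if p.requires_grad)
print(logits.shape, trainable)

In a full system the low-rank factors would sit inside the projections of a chat LLM and the head would score prompts and responses for moderation; this toy keeps the same trainable-parameter structure at small scale.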

Papers citing "LoRA-Guard: Parameter-Efficient Guardrail Adaptation for Content Moderation of Large Language Models"

8 / 8 papers shown
Guardians and Offenders: A Survey on Harmful Content Generation and Safety Mitigation of LLM
Chi Zhang, Changjia Zhu, Junjie Xiong, Xiaoran Xu, Jinkui Chi, Yao Liu, Zhuo Lu
ELM · 07 Aug 2025

Shape it Up! Restoring LLM Safety during Finetuning
ShengYun Peng, Pin-Yu Chen, Jianfeng Chi, Seongmin Lee, Duen Horng Chau
LLMAG · 22 May 2025

No Free Lunch with Guardrails
Divyanshu Kumar, Nitin Aravind Birur, Tanay Baswa, Sahil Agarwal, P. Harshangi
01 Apr 2025

Bridging Today and the Future of Humanity: AI Safety in 2024 and Beyond
Shanshan Han
09 Oct 2024

Great, Now Write an Article About That: The Crescendo Multi-Turn LLM Jailbreak Attack
M. Russinovich, Ahmed Salem, Ronen Eldan
02 Apr 2024

Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks
International Conference on Learning Representations (ICLR), 2024
Maksym Andriushchenko, Francesco Croce, Nicolas Flammarion
AAML · 02 Apr 2024

Defending Jailbreak Prompts via In-Context Adversarial Game
Yujun Zhou, Yufei Han, Haomin Zhuang, Kehan Guo, Zhenwen Liang, Hongyan Bao, Xiangliang Zhang
LLMAG, AAML · 20 Feb 2024

An Empirical Study of Catastrophic Forgetting in Large Language Models During Continual Fine-tuning
IEEE Transactions on Audio, Speech, and Language Processing (TASLP), 2023
Yun Luo, Zhen Yang, Fandong Meng, Yafu Li, Jie Zhou, Yue Zhang
CLL, KELM · 17 Aug 2023