
Prefix Guidance: A Steering Wheel for Large Language Models to Defend Against Jailbreak Attacks
Papers citing "Prefix Guidance: A Steering Wheel for Large Language Models to Defend Against Jailbreak Attacks"
7 / 7 papers shown
Title |
---|
Llama Guard: LLM-based Input-Output Safeguard for Human-AI Conversations. Hakan Inan, Kartikeya Upasani, Jianfeng Chi, Rashi Rungta, Krithika Iyer, ..., Michael Tontchev, Qing Hu, Brian Fuller, Davide Testuggine, Madian Khabsa |