2408.11182
Cited By
Hide Your Malicious Goal Into Benign Narratives: Jailbreak Large Language Models through Carrier Articles
20 August 2024
Zhilong Wang
Haizhou Wang
Nanqing Luo
Lan Zhang
Xiaoyan Sun
Yebo Cao
Peng Liu
Papers citing
"Hide Your Malicious Goal Into Benign Narratives: Jailbreak Large Language Models through Carrier Articles"
1 / 1 papers shown
Title
Layer-Level Self-Exposure and Patch: Affirmative Token Mitigation for Jailbreak Attack Defense
Yang Ouyang
Hengrui Gu
Shuhang Lin
Wenyue Hua
Jie Peng
B. Kailkhura
Tianlong Chen
Kaixiong Zhou
AAML
31
1
0
05 Jan 2025