arXiv:2404.04849
Hidden You Malicious Goal Into Benign Narratives: Jailbreak Large Language Models through Logic Chain Injection
Zhilong Wang, Yebo Cao, Peng Liu
7 April 2024
Papers citing "Hidden You Malicious Goal Into Benign Narratives: Jailbreak Large Language Models through Logic Chain Injection" (2 of 2 papers shown)
Using Mechanistic Interpretability to Craft Adversarial Attacks against Large Language Models
Thomas Winninger, Boussad Addad, Katarzyna Kapusta
AAML · 08 Mar 2025
GPTFUZZER: Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts
Jiahao Yu, Xingwei Lin, Zheng Yu, Xinyu Xing
SILM · 19 Sep 2023