
Latent Jailbreak: A Benchmark for Evaluating Text Safety and Output Robustness of Large Language Models
Papers citing "Latent Jailbreak: A Benchmark for Evaluating Text Safety and Output Robustness of Large Language Models"
Title |
---|
AutoDAN-Turbo: A Lifelong Agent for Strategy Self-Exploration to Jailbreak LLMs. Xiaogeng Liu, Peiran Li, Edward Suh, Yevgeniy Vorobeychik, Zhuoqing Mao, Somesh Jha, Patrick McDaniel, Huan Sun, Bo Li, Chaowei Xiao |