
Tree of Attacks: Jailbreaking Black-Box LLMs Automatically
Papers citing "Tree of Attacks: Jailbreaking Black-Box LLMs Automatically"
Title | Authors |
---|---|
AutoDAN-Turbo: A Lifelong Agent for Strategy Self-Exploration to Jailbreak LLMs | Xiaogeng Liu, Peiran Li, Edward Suh, Yevgeniy Vorobeychik, Zhuoqing Mao, Somesh Jha, Patrick McDaniel, Huan Sun, Bo Li, Chaowei Xiao |
SelfDefend: LLMs Can Defend Themselves against Jailbreaking in a Practical Manner | Xunguang Wang, Daoyuan Wu, Zhenlan Ji, Zongjie Li, Pingchuan Ma, Shuai Wang, Yingjiu Li, Yang Liu, Ning Liu, Juergen Rahmel |
Llama Guard: LLM-based Input-Output Safeguard for Human-AI Conversations | Hakan Inan, Kartikeya Upasani, Jianfeng Chi, Rashi Rungta, Krithika Iyer, ..., Michael Tontchev, Qing Hu, Brian Fuller, Davide Testuggine, Madian Khabsa |