Tempest: Autonomous Multi-Turn Jailbreaking of Large Language Models with Tree Search
Andy Zhou, Ron Arel
arXiv:2503.10619 · 13 March 2025
Papers citing "Tempest: Autonomous Multi-Turn Jailbreaking of Large Language Models with Tree Search" (7 of 7 papers shown)
Automated Red Teaming with GOAT: the Generative Offensive Agent Tester
Maya Pavlova, Erik Brinkman, Krithika Iyer, Vítor Albiero, Joanna Bitton, Hailey Nguyen, Jingkai Li, Cristian Canton Ferrer, Ivan Evtimov, Aaron Grattafiori
02 Oct 2024

RED QUEEN: Safeguarding Large Language Models against Concealed Multi-Turn Jailbreaking
Yifan Jiang, Kriti Aggarwal, Tanmay Laud, Kashif Munir, Jay Pujara, Subhabrata Mukherjee
26 Sep 2024

Great, Now Write an Article About That: The Crescendo Multi-Turn LLM Jailbreak Attack
M. Russinovich, Ahmed Salem, Ronen Eldan
02 Apr 2024

Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks
Maksym Andriushchenko, Francesco Croce, Nicolas Flammarion
02 Apr 2024

Coercing LLMs to do and reveal (almost) anything
Jonas Geiping, Alex Stein, Manli Shu, Khalid Saifullah, Yuxin Wen, Tom Goldstein
21 Feb 2024

How Johnny Can Persuade LLMs to Jailbreak Them: Rethinking Persuasion to Challenge AI Safety by Humanizing LLMs
Yi Zeng, Hongpeng Lin, Jingwen Zhang, Diyi Yang, Ruoxi Jia, Weiyan Shi
12 Jan 2024

Universal and Transferable Adversarial Attacks on Aligned Language Models
Andy Zou, Zifan Wang, Nicholas Carlini, Milad Nasr, J. Zico Kolter, Matt Fredrikson
27 Jul 2023