2410.01606
Cited By
Automated Red Teaming with GOAT: the Generative Offensive Agent Tester
2 October 2024
Maya Pavlova
Erik Brinkman
Krithika Iyer
Vítor Albiero
Joanna Bitton
Hailey Nguyen
Jingkai Li
Cristian Canton Ferrer
Ivan Evtimov
Aaron Grattafiori
ALM
Papers citing "Automated Red Teaming with GOAT: the Generative Offensive Agent Tester"
10 / 10 papers shown
Title
TwinBreak: Jailbreaking LLM Security Alignments based on Twin Prompts
T. Krauß
Hamid Dashtbani
Alexandra Dmitrienko
25
0
0
09 Jun 2025
Quality-Diversity Red-Teaming: Automated Generation of High-Quality and Diverse Attackers for Large Language Models
Ren-Jian Wang
Ke Xue
Zeyu Qin
Ziniu Li
Sheng Tang
Hao-Tian Li
Shengcai Liu
Chao Qian
AAML
24
0
0
08 Jun 2025
Open Challenges in Multi-Agent Security: Towards Secure Systems of Interacting AI Agents
Christian Schroeder de Witt
AAML
AI4CE
485
6
0
04 May 2025
WASP: Benchmarking Web Agent Security Against Prompt Injection Attacks
Ivan Evtimov
Arman Zharmagambetov
Aaron Grattafiori
Chuan Guo
Kamalika Chaudhuri
AAML
118
4
0
22 Apr 2025
Strategize Globally, Adapt Locally: A Multi-Turn Red Teaming Agent with Dual-Level Learning
Tian Jin
Xiao Yu
Ninareh Mehrabi
Rahul Gupta
Zhou Yu
Ruoxi Jia
AAML
LLMAG
110
0
0
02 Apr 2025
Tempest: Autonomous Multi-Turn Jailbreaking of Large Language Models with Tree Search
Andy Zhou
Ron Arel
MU
145
0
0
13 Mar 2025
Steering Dialogue Dynamics for Robustness against Multi-turn Jailbreaking Attacks
Hanjiang Hu
Alexander Robey
Changliu Liu
AAML
LLMSV
107
2
0
28 Feb 2025
TurboFuzzLLM: Turbocharging Mutation-based Fuzzing for Effectively Jailbreaking Large Language Models in Practice
Aman Goel
Xian Carrie Wu
Zhe Wang
Dmitriy Bespalov
Yanjun Qi
116
0
0
21 Feb 2025
Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks
Maksym Andriushchenko
Francesco Croce
Nicolas Flammarion
AAML
210
222
0
02 Apr 2024
Great, Now Write an Article About That: The Crescendo Multi-Turn LLM Jailbreak Attack
M. Russinovich
Ahmed Salem
Ronen Eldan
122
98
0
02 Apr 2024