Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2504.01278
Cited By
Strategize Globally, Adapt Locally: A Multi-Turn Red Teaming Agent with Dual-Level Learning
2 April 2025
Tian Jin
Xiao Yu
Ninareh Mehrabi
Rahul Gupta
Zhou Yu
Ruoxi Jia
AAML
LLMAG
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Strategize Globally, Adapt Locally: A Multi-Turn Red Teaming Agent with Dual-Level Learning"
7 / 7 papers shown
Title
Automated Red Teaming with GOAT: the Generative Offensive Agent Tester
Maya Pavlova
Erik Brinkman
Krithika Iyer
Vítor Albiero
Joanna Bitton
Hailey Nguyen
Jingkai Li
Cristian Canton Ferrer
Ivan Evtimov
Aaron Grattafiori
ALM
72
12
0
02 Oct 2024
RED QUEEN: Safeguarding Large Language Models against Concealed Multi-Turn Jailbreaking
Yifan Jiang
Kriti Aggarwal
Tanmay Laud
Kashif Munir
Jay Pujara
Subhabrata Mukherjee
AAML
113
13
0
26 Sep 2024
Great, Now Write an Article About That: The Crescendo Multi-Turn LLM Jailbreak Attack
M. Russinovich
Ahmed Salem
Ronen Eldan
116
98
0
02 Apr 2024
How Johnny Can Persuade LLMs to Jailbreak Them: Rethinking Persuasion to Challenge AI Safety by Humanizing LLMs
Yi Zeng
Hongpeng Lin
Jingwen Zhang
Diyi Yang
Ruoxi Jia
Weiyan Shi
97
317
0
12 Jan 2024
GPTFUZZER: Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts
Jiahao Yu
Xingwei Lin
Zheng Yu
Xinyu Xing
SILM
221
352
0
19 Sep 2023
Universal and Transferable Adversarial Attacks on Aligned Language Models
Andy Zou
Zifan Wang
Nicholas Carlini
Milad Nasr
J. Zico Kolter
Matt Fredrikson
297
1,518
0
27 Jul 2023
MiniLMv2: Multi-Head Self-Attention Relation Distillation for Compressing Pretrained Transformers
Wenhui Wang
Hangbo Bao
Shaohan Huang
Li Dong
Furu Wei
MQ
108
272
0
31 Dec 2020
1