2402.09132
Cited By
Exploring the Adversarial Capabilities of Large Language Models
14 February 2024
Lukas Struppek, Minh Hieu Le, Dominik Hintersdorf, Kristian Kersting
ELM, AAML
Papers citing
"Exploring the Adversarial Capabilities of Large Language Models"
5 / 5 papers shown
BlueSuffix: Reinforced Blue Teaming for Vision-Language Models Against Jailbreak Attacks
Yunhan Zhao, Xiang Zheng, Lin Luo, Yige Li, Xingjun Ma, Yu-Gang Jiang
VLM, AAML
62 / 3 / 0
28 Oct 2024

Jailbreak Attacks and Defenses Against Large Language Models: A Survey
Sibo Yi, Yule Liu, Zhen Sun, Tianshuo Cong, Xinlei He, Jiaxing Song, Ke Xu, Qi Li
AAML
39 / 82 / 0
05 Jul 2024

LLMs for Cyber Security: New Opportunities
D. Divakaran, Sai Teja Peddinti
24 / 11 / 0
17 Apr 2024

Training language models to follow instructions with human feedback
Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe
OSLM, ALM
339 / 12,003 / 0
04 Mar 2022

Extracting Training Data from Large Language Models
Nicholas Carlini, Florian Tramèr, Eric Wallace, Matthew Jagielski, Ariel Herbert-Voss, ..., Tom B. Brown, D. Song, Ulfar Erlingsson, Alina Oprea, Colin Raffel
MLAU, SILM
290 / 1,824 / 0
14 Dec 2020