Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2406.14595
Cited By
Adversaries Can Misuse Combinations of Safe Models
20 June 2024
Erik Jones
Anca Dragan
Jacob Steinhardt
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Adversaries Can Misuse Combinations of Safe Models"
7 / 7 papers shown
Title
Open Challenges in Multi-Agent Security: Towards Secure Systems of Interacting AI Agents
Christian Schroeder de Witt
AAML
AI4CE
147
0
0
04 May 2025
Superintelligence Strategy: Expert Version
Dan Hendrycks
Eric Schmidt
Alexandr Wang
59
1
0
07 Mar 2025
Machine Unlearning Doesn't Do What You Think: Lessons for Generative AI Policy, Research, and Practice
A. Feder Cooper
Christopher A. Choquette-Choo
Miranda Bogen
Matthew Jagielski
Katja Filippova
...
Abigail Z. Jacobs
Andreas Terzis
Hanna M. Wallach
Nicolas Papernot
Katherine Lee
AILaw
MU
93
10
0
09 Dec 2024
Chain-of-Jailbreak Attack for Image Generation Models via Editing Step by Step
Wenxuan Wang
Kuiyi Gao
Zihan Jia
Youliang Yuan
Jen-tse Huang
Qiuzhi Liu
Shuai Wang
Wenxiang Jiao
Zhaopeng Tu
121
2
0
04 Oct 2024
Grounding and Evaluation for Large Language Models: Practical Challenges and Lessons Learned (Survey)
K. Kenthapadi
M. Sameki
Ankur Taly
HILM
ELM
AILaw
39
12
0
10 Jul 2024
Feedback Loops With Language Models Drive In-Context Reward Hacking
Alexander Pan
Erik Jones
Meena Jagadeesan
Jacob Steinhardt
KELM
47
26
0
09 Feb 2024
Generative Agents: Interactive Simulacra of Human Behavior
J. Park
Joseph C. O'Brien
Carrie J. Cai
Meredith Ringel Morris
Percy Liang
Michael S. Bernstein
LM&Ro
AI4CE
232
1,734
0
07 Apr 2023
1