Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2404.02928
Cited By
Jailbreaking Prompt Attack: A Controllable Adversarial Attack against Diffusion Models
2 April 2024
Jiachen Ma
Anda Cao
Zhiqing Xiao
Jie Zhang
Chaonan Ye
Chao Ye
Junbo Zhao
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Jailbreaking Prompt Attack: A Controllable Adversarial Attack against Diffusion Models"
24 / 24 papers shown
Title
Token-Level Constraint Boundary Search for Jailbreaking Text-to-Image Models
Qingbin Liu
Zhaoxin Wang
Handing Wang
Cong Tian
Yaochu Jin
30
0
0
15 Apr 2025
Towards Understanding the Safety Boundaries of DeepSeek Models: Evaluation and Findings
Zonghao Ying
Guangyi Zheng
Yongxin Huang
Deyue Zhang
Wenxin Zhang
Quanchen Zou
Aishan Liu
Xianglong Liu
Dacheng Tao
ELM
87
9
0
19 Mar 2025
"I am bad": Interpreting Stealthy, Universal and Robust Audio Jailbreaks in Audio-Language Models
Isha Gupta
David Khachaturov
Robert D. Mullins
AAML
AuLLM
79
2
0
02 Feb 2025
Mitigating Adversarial Attacks in LLMs through Defensive Suffix Generation
Minkyoung Kim
Yunha Kim
Hyeram Seo
Heejung Choi
Jiye Han
...
Hyoje Jung
Byeolhee Kim
Young-Hak Kim
Sanghyun Park
Tae Joon Jun
AAML
98
0
0
18 Dec 2024
Safeguarding Text-to-Image Generation via Inference-Time Prompt-Noise Optimization
Jiangweizhi Peng
Zhiwei Tang
Gaowen Liu
Charles Fleming
Mingyi Hong
94
3
0
05 Dec 2024
Jailbreak Attacks and Defenses against Multimodal Generative Models: A Survey
Xuannan Liu
Xing Cui
Peipei Li
Zekun Li
Huaibo Huang
Shuhan Xia
Miaoxuan Zhang
Yueying Zou
Ran He
AAML
67
8
0
14 Nov 2024
AdvI2I: Adversarial Image Attack on Image-to-Image Diffusion models
Yaopei Zeng
Yuanpu Cao
Bochuan Cao
Yurui Chang
Jinghui Chen
Lu Lin
DiffM
46
3
0
28 Oct 2024
Meta-Unlearning on Diffusion Models: Preventing Relearning Unlearned Concepts
Hongcheng Gao
Tianyu Pang
Chao Du
Taihang Hu
Zhijie Deng
Min Lin
DiffM
52
10
0
16 Oct 2024
SAFREE: Training-Free and Adaptive Guard for Safe Text-to-Image And Video Generation
Jaehong Yoon
Shoubin Yu
Vaidehi Patil
Huaxiu Yao
Joey Tianyi Zhou
85
21
0
16 Oct 2024
Unstable Unlearning: The Hidden Risk of Concept Resurgence in Diffusion Models
Vinith Suriyakumar
Rohan Alur
Ayush Sekhari
Manish Raghavan
Ashia Wilson
61
3
0
10 Oct 2024
Holistic Unlearning Benchmark: A Multi-Faceted Evaluation for Text-to-Image Diffusion Model Unlearning
Saemi Moon
M. Lee
Sangdon Park
Dongwoo Kim
44
2
0
08 Oct 2024
SteerDiff: Steering towards Safe Text-to-Image Diffusion Models
Hongxiang Zhang
Yifeng He
Hao Chen
38
4
0
03 Oct 2024
Perception-guided Jailbreak against Text-to-Image Models
Yihao Huang
Le Liang
Tianlin Li
Xiaojun Jia
Run Wang
Weikai Miao
G. Pu
Yang Liu
46
7
0
20 Aug 2024
Enhance Modality Robustness in Text-Centric Multimodal Alignment with Adversarial Prompting
Yun-Da Tsai
Ting-Yu Yen
Keng-Te Liao
Shou-De Lin
52
2
0
19 Aug 2024
DiffZOO: A Purely Query-Based Black-Box Attack for Red-teaming Text-to-Image Generative Model via Zeroth Order Optimization
Pucheng Dang
Xing Hu
Dong Li
Rui Zhang
Qi Guo
Kaidi Xu
DiffM
52
5
0
18 Aug 2024
Adversarial Attacks and Defenses on Text-to-Image Diffusion Models: A Survey
Chenyu Zhang
Mingwang Hu
Wenhui Li
Lanjun Wang
53
15
0
10 Jul 2024
T2VSafetyBench: Evaluating the Safety of Text-to-Video Generative Models
Yibo Miao
Yifan Zhu
Yinpeng Dong
Lijia Yu
Jun Zhu
Xiao-Shan Gao
EGVM
55
13
0
08 Jul 2024
Lockpicking LLMs: A Logit-Based Jailbreak Using Token-level Manipulation
Yuxi Li
Yi Liu
Yuekang Li
Ling Shi
Gelei Deng
Shengquan Chen
Kailong Wang
71
12
0
20 May 2024
Espresso: Robust Concept Filtering in Text-to-Image Models
Anudeep Das
Vasisht Duddu
Rui Zhang
Nadarajah Asokan
EGVM
43
7
0
30 Apr 2024
Red-Teaming the Stable Diffusion Safety Filter
Javier Rando
Daniel Paleka
David Lindner
Lennard Heim
Florian Tramèr
DiffM
135
188
0
03 Oct 2022
Discovering the Hidden Vocabulary of DALLE-2
Giannis Daras
A. Dimakis
152
66
0
01 Jun 2022
Label-Efficient Semantic Segmentation with Diffusion Models
Dmitry Baranchuk
Ivan Rubachev
A. Voynov
Valentin Khrulkov
Artem Babenko
DiffM
VLM
195
524
0
06 Dec 2021
Crystal Diffusion Variational Autoencoder for Periodic Material Generation
Tian Xie
Xiang Fu
O. Ganea
Regina Barzilay
Tommi Jaakkola
DiffM
BDL
212
238
0
12 Oct 2021
Gradient-based Adversarial Attacks against Text Transformers
Chuan Guo
Alexandre Sablayrolles
Hervé Jégou
Douwe Kiela
SILM
112
232
0
15 Apr 2021
1