2309.06135
Prompting4Debugging: Red-Teaming Text-to-Image Diffusion Models by Finding Problematic Prompts
12 September 2023
Zhi-Yi Chin
Chieh-Ming Jiang
Ching-Chun Huang
Pin-Yu Chen
Wei-Chen Chiu
DiffM
Papers citing
"Prompting4Debugging: Red-Teaming Text-to-Image Diffusion Models by Finding Problematic Prompts"
27 / 27 papers shown
Title
The Dual Power of Interpretable Token Embeddings: Jailbreaking Attacks and Defenses for Diffusion Model Unlearning
Siyi Chen
Yimeng Zhang
Sijia Liu
Q. Qu
AAML
147
0
0
30 Apr 2025
Erased but Not Forgotten: How Backdoors Compromise Concept Erasure
Jonas Henry Grebe
Tobias Braun
Marcus Rohrbach
Anna Rohrbach
AAML
85
0
0
29 Apr 2025
On the Vulnerability of Concept Erasure in Diffusion Models
Lucas Beerens
Alex D. Richardson
Peng Sun
Dongdong Chen
DiffM
65
2
0
24 Feb 2025
A Systematic Review of Open Datasets Used in Text-to-Image (T2I) Gen AI Model Safety
Rakeen Rouf
Trupti Bavalatti
Osama Ahmed
Dhaval Potdar
Faraz Jawed
EGVM
66
1
0
23 Feb 2025
Concept Corrector: Erase concepts on the fly for text-to-image diffusion models
Zheling Meng
Bo Peng
Xiaochuan Jin
Yueming Lyu
Wei Wang
Jing Dong
DiffM
48
2
0
22 Feb 2025
A Comprehensive Survey on Concept Erasure in Text-to-Image Diffusion Models
Changhoon Kim
Yanjun Qi
DiffM
45
1
0
17 Feb 2025
CE-SDWV: Effective and Efficient Concept Erasure for Text-to-Image Diffusion Models via a Semantic-Driven Word Vocabulary
Jiahang Tu
Qian Feng
Chufan Chen
Jiahua Dong
Hanbin Zhao
Chao Zhang
Hui Qian
72
2
0
28 Jan 2025
Direct Unlearning Optimization for Robust and Safe Text-to-Image Models
Yong-Hyun Park
Sangdoo Yun
Jin-Hwa Kim
Junho Kim
Geonhui Jang
Yonghyun Jeong
Junghyo Jo
Gayoung Lee
76
13
0
17 Jan 2025
MLLM-as-a-Judge for Image Safety without Human Labeling
Zhenting Wang
Shuming Hu
Shiyu Zhao
Xiaowen Lin
F. Xu
...
Nan Jiang
Lingjuan Lyu
Shiqing Ma
Dimitris N. Metaxas
Ankit Jain
164
2
0
31 Dec 2024
Continuous Concepts Removal in Text-to-image Diffusion Models
Tingxu Han
Dongrui Liu
Yanrong Hu
Chunrong Fang
Yonglong Zhang
Shiqing Ma
Tao Zheng
Zhenyu Chen
Zhenting Wang
DiffM
114
2
0
30 Nov 2024
In-Context Experience Replay Facilitates Safety Red-Teaming of Text-to-Image Diffusion Models
Zhi-Yi Chin
Kuan-Chen Mu
Mario Fritz
Pin-Yu Chen
DiffM
87
0
0
25 Nov 2024
SAFREE: Training-Free and Adaptive Guard for Safe Text-to-Image And Video Generation
Jaehong Yoon
Shoubin Yu
Vaidehi Patil
Huaxiu Yao
Joey Tianyi Zhou
79
15
0
16 Oct 2024
Holistic Unlearning Benchmark: A Multi-Faceted Evaluation for Text-to-Image Diffusion Model Unlearning
Saemi Moon
M. Lee
Sangdon Park
Dongwoo Kim
44
1
0
08 Oct 2024
DiffZOO: A Purely Query-Based Black-Box Attack for Red-teaming Text-to-Image Generative Model via Zeroth Order Optimization
Pucheng Dang
Xing Hu
Dong Li
Rui Zhang
Qi Guo
Kaidi Xu
DiffM
36
5
0
18 Aug 2024
HateSieve: A Contrastive Learning Framework for Detecting and Segmenting Hateful Content in Multimodal Memes
Xuanyu Su
Yansong Li
Diana Inkpen
Nathalie Japkowicz
VLM
81
2
0
11 Aug 2024
Attacks and Defenses for Generative Diffusion Models: A Comprehensive Survey
V. T. Truong
Luan Ba Dang
Long Bao Le
DiffM
MedIm
56
16
0
06 Aug 2024
Jailbreaking Text-to-Image Models with LLM-Based Agents
Yingkai Dong
Zheng Li
Xiangtao Meng
Ning Yu
Shanqing Guo
LLMAG
45
13
0
01 Aug 2024
RIGID: A Training-free and Model-Agnostic Framework for Robust AI-Generated Image Detection
Zhiyuan He
Pin-Yu Chen
Tsung-Yi Ho
44
12
0
30 May 2024
Distilling Adversarial Prompts from Safety Benchmarks: Report for the Adversarial Nibbler Challenge
Manuel Brack
P. Schramowski
Kristian Kersting
AAML
EGVM
29
7
0
20 Sep 2023
Red-Teaming the Stable Diffusion Safety Filter
Javier Rando
Daniel Paleka
David Lindner
Lennart Heim
Florian Tramèr
DiffM
129
183
0
03 Oct 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
328
11,953
0
04 Mar 2022
Constrained Language Models Yield Few-Shot Semantic Parsers
Richard Shin
C. H. Lin
Sam Thomson
Charles C. Chen
Subhro Roy
Emmanouil Antonios Platanios
Adam Pauls
Dan Klein
J. Eisner
Benjamin Van Durme
295
198
0
18 Apr 2021
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester
Rami Al-Rfou
Noah Constant
VPVLM
280
3,848
0
18 Apr 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
255
4,781
0
24 Feb 2021
Making Pre-trained Language Models Better Few-shot Learners
Tianyu Gao
Adam Fisch
Danqi Chen
243
1,919
0
31 Dec 2020
Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference
Timo Schick
Hinrich Schütze
258
1,589
0
21 Jan 2020
Language Models as Knowledge Bases?
Fabio Petroni
Tim Rocktäschel
Patrick Lewis
A. Bakhtin
Yuxiang Wu
Alexander H. Miller
Sebastian Riedel
KELM
AI4MH
417
2,588
0
03 Sep 2019