Jailbreak in pieces: Compositional Adversarial Attacks on Multi-Modal Language Models
Erfan Shayegani, Yue Dong, Nael B. Abu-Ghazaleh
arXiv:2307.14539 · 26 July 2023
Papers citing "Jailbreak in pieces: Compositional Adversarial Attacks on Multi-Modal Language Models" (50 of 106 papers shown):

X-Transfer Attacks: Towards Super Transferable Adversarial Attacks on CLIP
Hanxun Huang, Sarah Monazam Erfani, Yige Li, Xingjun Ma, James Bailey · AAML · 08 May 2025

REVEAL: Multi-turn Evaluation of Image-Input Harms for Vision LLM
Madhur Jindal, Saurabh Deshpande · AAML · 07 May 2025

Adversarial Robustness Analysis of Vision-Language Models in Medical Image Segmentation
Anjila Budathoki, Manish Dhakal · AAML · 05 May 2025

DREAM: Disentangling Risks to Enhance Safety Alignment in Multimodal Large Language Models
Xiaozhong Liu, Hangyu Guo, Ranjie Duan, Xingyuan Bu, Yancheng He, ..., Yingshui Tan, Yanan Wu, Jihao Gu, Heng Chang, Jun Zhu · MLLM · 25 Apr 2025

Manipulating Multimodal Agents via Cross-Modal Prompt Injection
Le Wang, Zonghao Ying, Tianyuan Zhang, Siyuan Liang, Shengshan Hu, Mingchuan Zhang, A. Liu, Xianglong Liu · AAML · 19 Apr 2025

QAVA: Query-Agnostic Visual Attack to Large Vision-Language Models
Yudong Zhang, Ruobing Xie, Jiansheng Chen, Xingchen Sun, Zhanhui Kang, Yu Wang · AAML · 15 Apr 2025

Do We Really Need Curated Malicious Data for Safety Alignment in Multi-modal Large Language Models?
Yanbo Wang, Jiyang Guan, Jian Liang, Ran He · 14 Apr 2025

AdaSteer: Your Aligned LLM is Inherently an Adaptive Jailbreak Defender
Weixiang Zhao, Jiahe Guo, Yulin Hu, Yang Deng, An Zhang, ..., Xinyang Han, Yanyan Zhao, Bing Qin, Tat-Seng Chua, Ting Liu · AAML, LLMSV · 13 Apr 2025

A Domain-Based Taxonomy of Jailbreak Vulnerabilities in Large Language Models
Carlos Peláez-González, Andrés Herrera-Poyatos, Cristina Zuheros, David Herrera-Poyatos, Virilo Tejedor, F. Herrera · AAML · 07 Apr 2025

More is Less: The Pitfalls of Multi-Model Synthetic Preference Data in DPO Safety Alignment
Yifan Wang, Runjin Chen, Bolian Li, David Cho, Yihe Deng, Ruqi Zhang, Tianlong Chen, Zhangyang Wang, A. Grama, Junyuan Hong · SyDa · 03 Apr 2025

PiCo: Jailbreaking Multimodal Large Language Models via Pictorial Code Contextualization
Aofan Liu, Lulu Tang, Ting Pan, Yuguo Yin, Bin Wang, Ao Yang · MLLM, AAML · 02 Apr 2025

Safeguarding Vision-Language Models: Mitigating Vulnerabilities to Gaussian Noise in Perturbation-based Attacks
Jiawei Wang, Yushen Zuo, Yuanjun Chai, Ziqiang Liu, Yichen Fu, Yichun Feng, Kin-Man Lam · AAML, VLM · 02 Apr 2025

AdPO: Enhancing the Adversarial Robustness of Large Vision-Language Models with Preference Optimization
Chaohu Liu, Tianyi Gui, Yu Liu, Linli Xu · VLM, AAML · 02 Apr 2025

Emerging Cyber Attack Risks of Medical AI Agents
Jianing Qiu, Lin Li, Jiankai Sun, Hao Wei, Zhe Xu, K. Lam, Wu Yuan · AAML · 02 Apr 2025

Misaligned Roles, Misplaced Images: Structural Input Perturbations Expose Multimodal Alignment Blind Spots
Erfan Shayegani, G M Shahariar, Sara Abdali, Lei Yu, Nael B. Abu-Ghazaleh, Yue Dong · AAML · 01 Apr 2025

Playing the Fool: Jailbreaking LLMs and Multimodal LLMs with Out-of-Distribution Strategy
Joonhyun Jeong, Seyun Bae, Yeonsung Jung, Jaeryong Hwang, Eunho Yang · AAML · 26 Mar 2025

MIRAGE: Multimodal Immersive Reasoning and Guided Exploration for Red-Team Jailbreak Attacks
Wenhao You, Bryan Hooi, Yiwei Wang, Yixuan Wang, Zong Ke, Ming Yang, Zi Huang, Yujun Cai · AAML · 24 Mar 2025

Web Artifact Attacks Disrupt Vision Language Models
Maan Qraitem, Piotr Teterwak, Kate Saenko, Bryan A. Plummer · AAML · 17 Mar 2025

Making Every Step Effective: Jailbreaking Large Vision-Language Models Through Hierarchical KV Equalization
Shuyang Hao, Yiwei Wang, Bryan Hooi, Jiaheng Liu, Muhao Chen, Zi Huang, Yujun Cai · AAML, VLM · 14 Mar 2025

Tit-for-Tat: Safeguarding Large Vision-Language Models Against Jailbreak Attacks via Adversarial Defense
Shuyang Hao, Y. Wang, Bryan Hooi, Ming Yang, Jiaheng Liu, Chengcheng Tang, Zi Huang, Yujun Cai · AAML · 14 Mar 2025

Utilizing Jailbreak Probability to Attack and Safeguard Multimodal LLMs
Wenzhuo Xu, Zhipeng Wei, Xiongtao Sun, Deyue Zhang, Dongdong Yang, Quanchen Zou, Xinming Zhang · AAML · 10 Mar 2025

TRCE: Towards Reliable Malicious Concept Erasure in Text-to-Image Diffusion Models
Ruidong Chen, Honglin Guo, Lanjun Wang, Chenyu Zhang, Weizhi Nie, An-an Liu · DiffM · 10 Mar 2025

CeTAD: Towards Certified Toxicity-Aware Distance in Vision Language Models
Xiangyu Yin, Jiaxu Liu, Zhen Chen, Jinwei Hu, Yi Dong, Xiaowei Huang, Wenjie Ruan · AAML · 08 Mar 2025

CLIP is Strong Enough to Fight Back: Test-time Counterattacks towards Zero-shot Adversarial Robustness of CLIP
Songlong Xing, Zhengyu Zhao, N. Sebe · AAML · 05 Mar 2025

FC-Attack: Jailbreaking Large Vision-Language Models via Auto-Generated Flowcharts
Ziyi Zhang, Zhen Sun, Zhe Zhang, Jihui Guo, Xinlei He · AAML · 28 Feb 2025

EigenShield: Causal Subspace Filtering via Random Matrix Theory for Adversarially Robust Vision-Language Models
Nastaran Darabi, Devashri Naik, Sina Tayebati, Dinithi Jayasuriya, Ranganath Krishnan, A. R. Trivedi · AAML · 24 Feb 2025

Unified Prompt Attack Against Text-to-Image Generation Models
Duo Peng, Qiuhong Ke, Mark He Huang, Ping Hu, Xiaozhong Liu · 23 Feb 2025

Tracking the Copyright of Large Vision-Language Models through Parameter Learning Adversarial Images
Yubo Wang, Jianting Tang, Chaohu Liu, Linli Xu · AAML · 23 Feb 2025

Understanding and Rectifying Safety Perception Distortion in VLMs
Xiaohan Zou, Jian Kang, George Kesidis, Lu Lin · 18 Feb 2025

SafeEraser: Enhancing Safety in Multimodal Large Language Models through Multimodal Machine Unlearning
Junkai Chen, Zhijie Deng, Kening Zheng, Yibo Yan, Shuliang Liu, PeiJun Wu, Peijie Jiang, Jiaheng Liu, Xuming Hu · MU · 18 Feb 2025

Multi-Faceted Multimodal Monosemanticity
Hanqi Yan, Xiangxiang Cui, Lu Yin, Paul Pu Liang, Yulan He, Yifei Wang · 16 Feb 2025

Distraction is All You Need for Multimodal Large Language Model Jailbreaking
Zuopeng Yang, Jiluan Fan, Anli Yan, Erdun Gao, Xin Lin, Tao Li, Kanghua Mo, Changyu Dong · AAML · 15 Feb 2025

Robust-LLaVA: On the Effectiveness of Large-Scale Robust Image Encoders for Multi-modal Large Language Models
H. Malik, Fahad Shamshad, Muzammal Naseer, Karthik Nandakumar, F. Khan, Salman Khan · AAML, MLLM, VLM · 03 Feb 2025
"I am bad": Interpreting Stealthy, Universal and Robust Audio Jailbreaks in Audio-Language Models
Isha Gupta
David Khachaturov
Robert D. Mullins
AAML
AuLLM
65
1
0
02 Feb 2025
Exploring Visual Vulnerabilities via Multi-Loss Adversarial Search for Jailbreaking Vision-Language Models
Shuyang Hao
Bryan Hooi
Jiaheng Liu
Kai-Wei Chang
Zi Huang
Yujun Cai
AAML
92
0
0
27 Nov 2024
Improving the Transferability of Adversarial Attacks on Face Recognition with Diverse Parameters Augmentation
Fengfan Zhou
Bangjie Yin
Hefei Ling
Qianyu Zhou
Wenxuan Wang
AAML
66
0
0
23 Nov 2024
SoK: Unifying Cybersecurity and Cybersafety of Multimodal Foundation Models with an Information Theory Approach
Ruoxi Sun
Jiamin Chang
Hammond Pearce
Chaowei Xiao
B. Li
Qi Wu
Surya Nepal
Minhui Xue
35
0
0
17 Nov 2024
Jailbreak Attacks and Defenses against Multimodal Generative Models: A Survey
Xuannan Liu
Xing Cui
Peipei Li
Zekun Li
Huaibo Huang
Shuhan Xia
Miaoxuan Zhang
Yueying Zou
Ran He
AAML
65
6
0
14 Nov 2024
New Emerged Security and Privacy of Pre-trained Model: a Survey and Outlook
Meng Yang
Tianqing Zhu
Chi Liu
Wanlei Zhou
Shui Yu
Philip S. Yu
AAML
ELM
PILM
61
1
0
12 Nov 2024
Unfair Alignment: Examining Safety Alignment Across Vision Encoder Layers in Vision-Language Models
Saketh Bachu
Erfan Shayegani
Trishna Chakraborty
Rohit Lal
Arindam Dutta
Chengyu Song
Yue Dong
Nael B. Abu-Ghazaleh
A. Roy-Chowdhury
34
0
0
06 Nov 2024

UniGuard: Towards Universal Safety Guardrails for Jailbreak Attacks on Multimodal Large Language Models
Sejoon Oh, Yiqiao Jin, Megha Sharma, Donghyun Kim, Eric Ma, Gaurav Verma, Srijan Kumar · 03 Nov 2024

Audio Is the Achilles' Heel: Red Teaming Audio Large Multimodal Models
Hao Yang, Lizhen Qu, Ehsan Shareghi, Gholamreza Haffari · AAML · 31 Oct 2024

Effective and Efficient Adversarial Detection for Vision-Language Models via A Single Vector
Youcheng Huang, Fengbin Zhu, Jingkun Tang, Pan Zhou, Wenqiang Lei, Jiancheng Lv, Tat-Seng Chua · AAML · 30 Oct 2024

CLEAR: Character Unlearning in Textual and Visual Modalities
Alexey Dontsov, Dmitrii Korzh, Alexey Zhavoronkin, Boris Mikheev, Denis Bobkov, Aibek Alanov, Oleg Y. Rogov, Ivan V. Oseledets, Elena Tutubalina · AILaw, VLM, MU · 23 Oct 2024

Bayesian scaling laws for in-context learning
Aryaman Arora, Dan Jurafsky, Christopher Potts, Noah D. Goodman · 21 Oct 2024

Hiding-in-Plain-Sight (HiPS) Attack on CLIP for Targetted Object Removal from Images
Arka Daw, Megan Hong-Thanh Chung, Maria Mahbub, Amir Sadovnik · AAML · 16 Oct 2024

SAFREE: Training-Free and Adaptive Guard for Safe Text-to-Image And Video Generation
Jaehong Yoon, Shoubin Yu, Vaidehi Patil, Huaxiu Yao, Joey Tianyi Zhou · 16 Oct 2024

Break the Visual Perception: Adversarial Attacks Targeting Encoded Visual Tokens of Large Vision-Language Models
Yubo Wang, Chaohu Liu, Yanqiu Qu, Haoyu Cao, Deqiang Jiang, Linli Xu · MLLM, AAML · 09 Oct 2024

Chain-of-Jailbreak Attack for Image Generation Models via Editing Step by Step
Wenxuan Wang, Kuiyi Gao, Zihan Jia, Youliang Yuan, Jen-tse Huang, Qiuzhi Liu, Shuai Wang, Wenxiang Jiao, Zhaopeng Tu · 04 Oct 2024

You Know What I'm Saying: Jailbreak Attack via Implicit Reference
Tianyu Wu, Lingrui Mei, Ruibin Yuan, Lujun Li, Wei Xue, Yike Guo · 04 Oct 2024