Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2311.05608
Cited By
FigStep: Jailbreaking Large Vision-Language Models via Typographic Visual Prompts
9 November 2023
Yichen Gong
Delong Ran
Jinyuan Liu
Conglei Wang
Tianshuo Cong
Anyu Wang
Sisi Duan
Xiaoyun Wang
MLLM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"FigStep: Jailbreaking Large Vision-Language Models via Typographic Visual Prompts"
50 / 101 papers shown
Title
Video-SafetyBench: A Benchmark for Safety Evaluation of Video LVLMs
Xuannan Liu
Zekun Li
Zheqi He
Peipei Li
Shuhan Xia
Xing Cui
Huaibo Huang
Xi Yang
Ran He
EGVM
AAML
23
0
0
17 May 2025
GuardReasoner-VL: Safeguarding VLMs via Reinforced Reasoning
Yong-Jin Liu
Shengfang Zhai
Mingzhe Du
Yulin Chen
Tri Cao
...
Xuzhao Li
Kun Wang
Junfeng Fang
Jiaheng Zhang
Bryan Hooi
OffRL
LRM
12
0
0
16 May 2025
Think in Safety: Unveiling and Mitigating Safety Alignment Collapse in Multimodal Large Reasoning Model
Xinyue Lou
You Li
Jinan Xu
Xiangyu Shi
Chong Chen
Kaiyu Huang
LRM
28
0
0
10 May 2025
X-Transfer Attacks: Towards Super Transferable Adversarial Attacks on CLIP
Hanxun Huang
Sarah Monazam Erfani
Yige Li
Xingjun Ma
James Bailey
AAML
53
0
0
08 May 2025
REVEAL: Multi-turn Evaluation of Image-Input Harms for Vision LLM
Madhur Jindal
Saurabh Deshpande
AAML
45
0
0
07 May 2025
"I Can See Forever!": Evaluating Real-time VideoLLMs for Assisting Individuals with Visual Impairments
Zhe Zhang
Zhen Sun
Zhenru Zhang
Zifan Peng
Yuemeng Zhao
Zihan Wang
Zeren Luo
Ruiting Zuo
Xinlei He
42
0
0
07 May 2025
DREAM: Disentangling Risks to Enhance Safety Alignment in Multimodal Large Language Models
Xiaozhong Liu
Hangyu Guo
Ranjie Duan
Xingyuan Bu
Yancheng He
...
Yingshui Tan
Yanan Wu
Jihao Gu
Heng Chang
Jun Zhu
MLLM
184
0
0
25 Apr 2025
Manipulating Multimodal Agents via Cross-Modal Prompt Injection
Le Wang
Zonghao Ying
Tianyuan Zhang
Siyuan Liang
Shengshan Hu
Mingchuan Zhang
A. Liu
Xianglong Liu
AAML
33
1
0
19 Apr 2025
DoomArena: A framework for Testing AI Agents Against Evolving Security Threats
Léo Boisvert
Mihir Bansal
Chandra Kiran Reddy Evuru
Gabriel Huang
Abhay Puri
...
Quentin Cappart
Jason Stanley
Alexandre Lacoste
Alexandre Drouin
Krishnamurthy Dvijotham
35
0
0
18 Apr 2025
VLMGuard-R1: Proactive Safety Alignment for VLMs via Reasoning-Driven Prompt Optimization
Menglan Chen
Xianghe Pang
Jingjing Dong
Wenhao Wang
Yaxin Du
Siheng Chen
LRM
39
0
0
17 Apr 2025
Do We Really Need Curated Malicious Data for Safety Alignment in Multi-modal Large Language Models?
Yanbo Wang
Jiyang Guan
Jian Liang
Ran He
56
0
0
14 Apr 2025
The Structural Safety Generalization Problem
Julius Broomfield
Tom Gibbs
Ethan Kosak-Hine
George Ingebretsen
Tia Nasir
Jason Zhang
Reihaneh Iranmanesh
Sara Pieri
Reihaneh Rabbany
Kellin Pelrine
AAML
35
0
0
13 Apr 2025
SafeMLRM: Demystifying Safety in Multi-modal Large Reasoning Models
Junfeng Fang
Yansen Wang
Ruipeng Wang
Zijun Yao
Kun Wang
An Zhang
Xuben Wang
Tat-Seng Chua
AAML
LRM
73
3
0
09 Apr 2025
SCAM: A Real-World Typographic Robustness Evaluation for Multimodal Foundation Models
Justus Westerhoff
Erblina Purellku
Jakob Hackstein
Jonas Loos
Leo Pinetzki
Lorenz Hufe
AAML
28
0
0
07 Apr 2025
A Domain-Based Taxonomy of Jailbreak Vulnerabilities in Large Language Models
Carlos Peláez-González
Andrés Herrera-Poyatos
Cristina Zuheros
David Herrera-Poyatos
Virilo Tejedor
F. Herrera
AAML
24
0
0
07 Apr 2025
PiCo: Jailbreaking Multimodal Large Language Models via
Pi
\textbf{Pi}
Pi
ctorial
Co
\textbf{Co}
Co
de Contextualization
Aofan Liu
Lulu Tang
Ting Pan
Yuguo Yin
Bin Wang
Ao Yang
MLLM
AAML
56
0
0
02 Apr 2025
Emerging Cyber Attack Risks of Medical AI Agents
Jianing Qiu
Lin Li
Jiankai Sun
Hao Wei
Zhe Xu
K. Lam
Wu Yuan
AAML
33
2
0
02 Apr 2025
Safeguarding Vision-Language Models: Mitigating Vulnerabilities to Gaussian Noise in Perturbation-based Attacks
Jiawei Wang
Yushen Zuo
Yuanjun Chai
Ziqiang Liu
Yichen Fu
Yichun Feng
Kin-Man Lam
AAML
VLM
47
0
0
02 Apr 2025
ShieldGemma 2: Robust and Tractable Image Content Moderation
Wenjun Zeng
D. Kurniawan
Ryan Mullins
Yuchi Liu
Tamoghna Saha
...
Mani Malek
Hamid Palangi
Joon Baek
Rick Pereira
Karthik Narasimhan
AI4MH
36
0
0
01 Apr 2025
Misaligned Roles, Misplaced Images: Structural Input Perturbations Expose Multimodal Alignment Blind Spots
Erfan Shayegani
G M Shahariar
Sara Abdali
Lei Yu
Nael B. Abu-Ghazaleh
Yue Dong
AAML
78
0
0
01 Apr 2025
Playing the Fool: Jailbreaking LLMs and Multimodal LLMs with Out-of-Distribution Strategy
Joonhyun Jeong
Seyun Bae
Yeonsung Jung
Jaeryong Hwang
Eunho Yang
AAML
43
1
0
26 Mar 2025
MIRAGE: Multimodal Immersive Reasoning and Guided Exploration for Red-Team Jailbreak Attacks
Wenhao You
Bryan Hooi
Yiwei Wang
Yansen Wang
Zong Ke
Ming Yang
Zi Huang
Yujun Cai
AAML
61
0
0
24 Mar 2025
PM4Bench: A Parallel Multilingual Multi-Modal Multi-task Benchmark for Large Vision Language Model
Junyuan Gao
Jiahe Song
J. Wu
Runchuan Zhu
Guanlin Shen
...
Weijia Li
Bin Wang
Dahua Lin
Lijun Wu
Conghui He
92
0
0
24 Mar 2025
REVAL: A Comprehension Evaluation on Reliability and Values of Large Vision-Language Models
Jie M. Zhang
Zheng Yuan
Ziyi Wang
Bei Yan
Sibo Wang
Xiangkui Cao
Zonghui Guo
Shiguang Shan
Xilin Chen
ELM
47
0
0
20 Mar 2025
Survey of Adversarial Robustness in Multimodal Large Language Models
Chengze Jiang
Zhuangzhuang Wang
Minjing Dong
Jie Gui
AAML
63
0
0
18 Mar 2025
Exploring Typographic Visual Prompts Injection Threats in Cross-Modality Generation Models
Hao-Ran Cheng
Erjia Xiao
Yichi Wang
Kaidi Xu
Mengshu Sun
Jindong Gu
Renjing Xu
41
0
0
14 Mar 2025
Tit-for-Tat: Safeguarding Large Vision-Language Models Against Jailbreak Attacks via Adversarial Defense
Shuyang Hao
Yijiao Wang
Bryan Hooi
Ming Yang
Jiaheng Liu
Chengcheng Tang
Zi Huang
Yujun Cai
AAML
54
0
0
14 Mar 2025
Making Every Step Effective: Jailbreaking Large Vision-Language Models Through Hierarchical KV Equalization
Shuyang Hao
Yiwei Wang
Bryan Hooi
Jiaheng Liu
Muhao Chen
Zi Huang
Yujun Cai
AAML
VLM
67
0
0
14 Mar 2025
ExtremeAIGC: Benchmarking LMM Vulnerability to AI-Generated Extremist Content
Bhavik Chandna
Mariam Aboujenane
Usman Naseem
60
0
0
13 Mar 2025
Utilizing Jailbreak Probability to Attack and Safeguard Multimodal LLMs
Wenzhuo Xu
Zhipeng Wei
Xiongtao Sun
Deyue Zhang
Dongdong Yang
Quanchen Zou
Xinming Zhang
AAML
52
0
0
10 Mar 2025
CeTAD: Towards Certified Toxicity-Aware Distance in Vision Language Models
Xiangyu Yin
Jiaxu Liu
Zhen Chen
Jinwei Hu
Yi Dong
Xiaowei Huang
Wenjie Ruan
AAML
50
0
0
08 Mar 2025
Adversarial Training for Multimodal Large Language Models against Jailbreak Attacks
Liming Lu
Shuchao Pang
Siyuan Liang
Haotian Zhu
Xiyu Zeng
Aishan Liu
Yunhuai Liu
Yongbin Zhou
AAML
51
2
0
05 Mar 2025
FC-Attack: Jailbreaking Large Vision-Language Models via Auto-Generated Flowcharts
Ziyi Zhang
Zhen Sun
Zhe Zhang
Jihui Guo
Xinlei He
AAML
55
2
0
28 Feb 2025
SEA: Low-Resource Safety Alignment for Multimodal Large Language Models via Synthetic Embeddings
Weikai Lu
Hao Peng
Huiping Zhuang
Cen Chen
Ziqian Zeng
0
0
0
18 Feb 2025
SafeEraser: Enhancing Safety in Multimodal Large Language Models through Multimodal Machine Unlearning
Junkai Chen
Zhijie Deng
Kening Zheng
Yibo Yan
Shuliang Liu
PeiJun Wu
Peijie Jiang
Jiaheng Liu
Xuming Hu
MU
69
3
0
18 Feb 2025
Understanding and Rectifying Safety Perception Distortion in VLMs
Xiaohan Zou
Jian Kang
George Kesidis
Lu Lin
210
1
0
18 Feb 2025
Adversary-Aware DPO: Enhancing Safety Alignment in Vision Language Models via Adversarial Training
Fenghua Weng
Jian Lou
Jun Feng
Minlie Huang
Wenjie Wang
AAML
75
2
0
17 Feb 2025
Distraction is All You Need for Multimodal Large Language Model Jailbreaking
Zuopeng Yang
Jiluan Fan
Anli Yan
Erdun Gao
Xin Lin
Tao Li
Kanghua mo
Changyu Dong
AAML
77
1
0
15 Feb 2025
Robust-LLaVA: On the Effectiveness of Large-Scale Robust Image Encoders for Multi-modal Large Language Models
H. Malik
Fahad Shamshad
Muzammal Naseer
Karthik Nandakumar
Fahad Shahbaz Khan
Salman Khan
AAML
MLLM
VLM
68
0
0
03 Feb 2025
Towards Robust Multimodal Large Language Models Against Jailbreak Attacks
Ziyi Yin
Yuanpu Cao
Han Liu
Ting Wang
Jinghui Chen
Fenhlong Ma
AAML
55
0
0
02 Feb 2025
Exploring Visual Vulnerabilities via Multi-Loss Adversarial Search for Jailbreaking Vision-Language Models
Shuyang Hao
Bryan Hooi
Jiaheng Liu
Kai-Wei Chang
Zi Huang
Yujun Cai
AAML
92
1
0
27 Nov 2024
Chain of Attack: On the Robustness of Vision-Language Models Against Transfer-Based Adversarial Attacks
Peng Xie
Yequan Bie
Jianda Mao
Yangqiu Song
Yang Wang
Hao Chen
Kani Chen
AAML
74
1
0
24 Nov 2024
SoK: Unifying Cybersecurity and Cybersafety of Multimodal Foundation Models with an Information Theory Approach
Ruoxi Sun
Jiamin Chang
Hammond Pearce
Chaowei Xiao
B. Li
Qi Wu
Surya Nepal
Minhui Xue
44
0
0
17 Nov 2024
Jailbreak Attacks and Defenses against Multimodal Generative Models: A Survey
Xuannan Liu
Xing Cui
Peipei Li
Zekun Li
Huaibo Huang
Shuhan Xia
Miaoxuan Zhang
Yueying Zou
Ran He
AAML
67
8
0
14 Nov 2024
Unfair Alignment: Examining Safety Alignment Across Vision Encoder Layers in Vision-Language Models
Saketh Bachu
Erfan Shayegani
Trishna Chakraborty
Rohit Lal
Arindam Dutta
Chengyu Song
Yue Dong
Nael B. Abu-Ghazaleh
A. Roy-Chowdhury
36
0
0
06 Nov 2024
Exploring Response Uncertainty in MLLMs: An Empirical Evaluation under Misleading Scenarios
Yunkai Dang
Mengxi Gao
Yibo Yan
Xin Zou
Yanggan Gu
Aiwei Liu
Xuming Hu
44
4
0
05 Nov 2024
Audio Is the Achilles' Heel: Red Teaming Audio Large Multimodal Models
Hao Yang
Lizhen Qu
Ehsan Shareghi
Gholamreza Haffari
AAML
36
3
0
31 Oct 2024
BlueSuffix: Reinforced Blue Teaming for Vision-Language Models Against Jailbreak Attacks
Yunhan Zhao
Xiang Zheng
Lin Luo
Yige Li
Xingjun Ma
Yu-Gang Jiang
VLM
AAML
62
3
0
28 Oct 2024
CLEAR: Character Unlearning in Textual and Visual Modalities
Alexey Dontsov
Dmitrii Korzh
Alexey Zhavoronkin
Boris Mikheev
Denis Bobkov
Aibek Alanov
Oleg Y. Rogov
Ivan Oseledets
Elena Tutubalina
AILaw
VLM
MU
68
5
0
23 Oct 2024
Insights and Current Gaps in Open-Source LLM Vulnerability Scanners: A Comparative Analysis
Jonathan Brokman
Omer Hofman
Oren Rachmil
Inderjeet Singh
Vikas Pahuja
Rathina Sabapathy Aishvariya Priya
Amit Giloni
Roman Vainshtein
Hisashi Kojima
36
2
0
21 Oct 2024
1
2
3
Next