Adversary-Aware DPO: Enhancing Safety Alignment in Vision Language Models via Adversarial Training
arXiv:2502.11455 · 17 February 2025
Fenghua Weng, Jian Lou, Jun Feng, Minlie Huang, Wenjie Wang
Tags: AAML
Papers citing "Adversary-Aware DPO: Enhancing Safety Alignment in Vision Language Models via Adversarial Training" (2 of 2 shown)
GuardReasoner-VL: Safeguarding VLMs via Reinforced Reasoning
Y. Liu, Shengfang Zhai, Mingzhe Du, Y. Chen, Tri Cao, ..., X. Li, Kun Wang, Junfeng Fang, Jiaheng Zhang, Bryan Hooi
Tags: OffRL, LRM · 16 May 2025
Misaligned Roles, Misplaced Images: Structural Input Perturbations Expose Multimodal Alignment Blind Spots
Erfan Shayegani, G M Shahariar, Sara Abdali, Lei Yu, Nael B. Abu-Ghazaleh, Yue Dong
Tags: AAML · 01 Apr 2025