2403.13507
FMM-Attack: A Flow-based Multi-modal Adversarial Attack on Video-based LLMs
20 March 2024
Jinmin Li
Kuofeng Gao
Yang Bai
Jingyun Zhang
Shu-Tao Xia
Yisen Wang
AAML
Papers citing
"FMM-Attack: A Flow-based Multi-modal Adversarial Attack on Video-based LLMs"
12 / 12 papers shown
Title
Survey of Adversarial Robustness in Multimodal Large Language Models
Chengze Jiang
Zhuangzhuang Wang
Minjing Dong
Jie Gui
AAML
63
0
0
18 Mar 2025
Image-based Multimodal Models as Intruders: Transferable Multimodal Attacks on Video-based MLLMs
Linhao Huang
Xue Jiang
Zhiqiang Wang
Wentao Mo
Xi Xiao
Bo Han
Yongjie Yin
Feng Zheng
AAML
53
2
0
02 Jan 2025
Video Watermarking: Safeguarding Your Video from (Unauthorized) Annotations by Video-based LLMs
Jinmin Li
Kuofeng Gao
Yang Bai
Jingyun Zhang
Shu-Tao Xia
48
4
0
02 Jul 2024
Invertible Residual Rescaling Models
Jinmin Li
Tao Dai
Yaohua Zha
Yilu Luo
Longfei Lu
Bin Chen
Zhi Wang
Shu-Tao Xia
Jingyun Zhang
SupR
42
3
0
05 May 2024
Boundary-aware Decoupled Flow Networks for Realistic Extreme Rescaling
Jinmin Li
Tao Dai
Jingyun Zhang
Kang Liu
Jun Wang
Shaoming Wang
Shu-Tao Xia
Rizen Guo
39
2
0
05 May 2024
Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers
Tsai-Shien Chen
Aliaksandr Siarohin
Willi Menapace
Ekaterina Deyneka
Hsiang-wei Chao
...
Yuwei Fang
Hsin-Ying Lee
Jian Ren
Ming-Hsuan Yang
Sergey Tulyakov
VGen
89
178
0
29 Feb 2024
Safety Fine-Tuning at (Almost) No Cost: A Baseline for Vision Large Language Models
Yongshuo Zong
Ondrej Bohdal
Tingyang Yu
Yongxin Yang
Timothy M. Hospedales
VLM
MLLM
57
57
0
03 Feb 2024
On the Multi-modal Vulnerability of Diffusion Models
Dingcheng Yang
Yang Bai
Xiaojun Jia
Yang Liu
Xiaochun Cao
Wenjian Yu
41
11
0
02 Feb 2024
FigStep: Jailbreaking Large Vision-Language Models via Typographic Visual Prompts
Yichen Gong
Delong Ran
Jinyuan Liu
Conglei Wang
Tianshuo Cong
Anyu Wang
Sisi Duan
Xiaoyun Wang
MLLM
129
118
0
09 Nov 2023
CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval
Huaishao Luo
Lei Ji
Ming Zhong
Yang Chen
Wen Lei
Nan Duan
Tianrui Li
CLIP
VLM
323
780
0
18 Apr 2021
Multi-modal Transformer for Video Retrieval
Valentin Gabeur
Chen Sun
Alahari Karteek
Cordelia Schmid
ViT
424
596
0
21 Jul 2020
Confidence Trigger Detection: Accelerating Real-time Tracking-by-detection Systems
Zhicheng Ding
Zhixin Lai
Siyang Li
Panfeng Li
Qikai Yang
E. Wong
38
23
0
02 Feb 2019