ResearchTrend.AI
Manipulating Multimodal Agents via Cross-Modal Prompt Injection

19 April 2025
Le Wang
Zonghao Ying
Tianyuan Zhang
Siyuan Liang
Shengshan Hu
Mingchuan Zhang
A. Liu
Xianglong Liu
    AAML

Papers citing "Manipulating Multimodal Agents via Cross-Modal Prompt Injection"

50 / 69 papers shown
Title
Red Teaming the Mind of the Machine: A Systematic Evaluation of Prompt Injection and Jailbreak Vulnerabilities in LLMs
Chetan Pathade
AAML, SILM
205
2
0
07 May 2025
Towards Understanding the Safety Boundaries of DeepSeek Models: Evaluation and Findings
Zonghao Ying
Guangyi Zheng
Yongxin Huang
Deyue Zhang
Wenxin Zhang
Quanchen Zou
Aishan Liu
Xianglong Liu
Dacheng Tao
ELM
142
13
0
19 Mar 2025
Attacking Multimodal OS Agents with Malicious Image Patches
Lukas Aichberger
Alasdair Paren
Y. Gal
Philip Torr
Adel Bibi
AAML
110
5
0
13 Mar 2025
BFA: Best-Feature-Aware Fusion for Multi-View Fine-grained Manipulation
Zihan Lan
Weixin Mao
Haoyang Li
Le Wang
Tiancai Wang
Haoqiang Fan
Osamu Yoshie
EgoV
119
2
0
20 Feb 2025
Benchmarking and Defending Against Indirect Prompt Injection Attacks on Large Language Models
Jingwei Yi
Yueqi Xie
Bin Zhu
Emre Kiciman
Guangzhong Sun
Xing Xie
Fangzhao Wu
AAML
150
82
0
28 Jan 2025
VLM-AD: End-to-End Autonomous Driving through Vision-Language Model Supervision
Yi Xu
Yuxin Hu
Zaiwei Zhang
Gregory P. Meyer
Siva Karthik Mustikovela
Siddhartha Srinivasa
Eric M. Wolff
Xin Huang
VLM, LRM
96
22
0
19 Dec 2024
Visual Adversarial Attack on Vision-Language Models for Autonomous Driving
Tianyuan Zhang
Lu Wang
Xinwei Zhang
Yize Zhang
Boyi Jia
Siyuan Liang
Shengshan Hu
Qiang Fu
Aishan Liu
Xianglong Liu
VLM, AAML
96
6
0
27 Nov 2024
Navigating the Risks: A Survey of Security, Privacy, and Ethics Threats in LLM-Based Agents
Yuyou Gan
Yong Yang
Zhe Ma
Ping He
Rui Zeng
...
Songze Li
Ting Wang
Yunjun Gao
Yingcai Wu
Shouling Ji
PILM, LLMAG
57
12
0
14 Nov 2024
Math-PUMA: Progressive Upward Multimodal Alignment to Enhance Mathematical Reasoning
Wenwen Zhuang
Xin Huang
Xiantao Zhang
Jin Zeng
LRM
115
31
0
16 Aug 2024
Empirical Analysis of Large Vision-Language Models against Goal Hijacking via Visual Prompt Injection
Subaru Kimura
Ryota Tanaka
Shumpei Miyawaki
Jun Suzuki
Keisuke Sakaguchi
MLLM
80
7
0
07 Aug 2024
Compromising Embodied Agents with Contextual Backdoor Attacks
Aishan Liu
Yuguang Zhou
Xianglong Liu
Tianyuan Zhang
Siyuan Liang
...
Tianlin Li
Junqi Zhang
Wenbo Zhou
Qing Guo
Dacheng Tao
LLMAG, AAML
97
13
0
06 Aug 2024
MiniCPM-V: A GPT-4V Level MLLM on Your Phone
Yuan Yao
Tianyu Yu
Ao Zhang
Chongyi Wang
Junbo Cui
...
Xu Han
Guoyang Zeng
Dahai Li
Zhiyuan Liu
Maosong Sun
VLM, MLLM
120
478
0
03 Aug 2024
DF40: Toward Next-Generation Deepfake Detection
Zhiyuan Yan
Taiping Yao
Shen Chen
Yandan Zhao
Xinghe Fu
...
Donghao Luo
Li Yuan
Chengjie Wang
Shouhong Ding
Yunsheng Wu
92
39
0
19 Jun 2024
DLP: towards active defense against backdoor attacks with decoupled learning process
Zonghao Ying
Bin Wu
AAML
114
10
0
18 Jun 2024
NBA: defensive distillation for backdoor removal via neural behavior alignment
Zonghao Ying
Bin Wu
AAML
46
10
0
16 Jun 2024
OpenVLA: An Open-Source Vision-Language-Action Model
Moo Jin Kim
Karl Pertsch
Siddharth Karamcheti
Ted Xiao
Ashwin Balakrishna
...
Russ Tedrake
Dorsa Sadigh
Sergey Levine
Percy Liang
Chelsea Finn
LM&Ro, VLM
257
533
0
13 Jun 2024
Unveiling the Safety of GPT-4o: An Empirical Study using Jailbreak Attacks
Zonghao Ying
Aishan Liu
Xianglong Liu
Dacheng Tao
116
25
0
10 Jun 2024
Jailbreak Vision Language Models via Bi-Modal Adversarial Prompt
Zonghao Ying
Aishan Liu
Tianyuan Zhang
Zhengmin Yu
Siyuan Liang
Xianglong Liu
Dacheng Tao
AAML
90
40
0
06 Jun 2024
Autonomous Workflow for Multimodal Fine-Grained Training Assistants Towards Mixed Reality
Jiahuan Pei
Irene Viola
Haochen Huang
Junxiao Wang
Moonisa Ahsan
...
Yao Sai
Di Wang
Zhumin Chen
Pengjie Ren
Pablo César
LM&Ro, LLMAG
101
8
0
16 May 2024
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
Marah Abdin
Sam Ade Jacobs
A. A. Awan
J. Aneja
Ahmed Hassan Awadallah
...
Li Zhang
Yi Zhang
Yue Zhang
Yunan Zhang
Xiren Zhou
LRM, ALM
169
1,265
0
22 Apr 2024
The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions
Eric Wallace
Kai Y. Xiao
R. Leike
Lilian Weng
Johannes Heidecke
Alex Beutel
SILM
120
141
0
19 Apr 2024
Optimization-based Prompt Injection Attack to LLM-as-a-Judge
Jiawen Shi
Zenghui Yuan
Yinuo Liu
Yue Huang
Pan Zhou
Lichao Sun
Neil Zhenqiang Gong
AAML
130
57
0
26 Mar 2024
Automatic and Universal Prompt Injection Attacks against Large Language Models
Xiaogeng Liu
Zhiyuan Yu
Yizhe Zhang
Ning Zhang
Chaowei Xiao
SILM, AAML
81
49
0
07 Mar 2024
InjecAgent: Benchmarking Indirect Prompt Injections in Tool-Integrated Large Language Model Agents
Qiusi Zhan
Zhixiang Liang
Zifan Ying
Daniel Kang
LLMAG
123
103
0
05 Mar 2024
Large Multimodal Agents: A Survey
Junlin Xie
Zhihong Chen
Ruifei Zhang
Xiang Wan
Guanbin Li
LM&Ro, LLMAG
90
44
0
23 Feb 2024
VL-Trojan: Multimodal Instruction Backdoor Attacks against Autoregressive Visual Language Models
Jiawei Liang
Siyuan Liang
Man Luo
Aishan Liu
Dongchen Han
Ee-Chien Chang
Xiaochun Cao
92
47
0
21 Feb 2024
MobileVLM : A Fast, Strong and Open Vision Language Assistant for Mobile Devices
Xiangxiang Chu
Limeng Qiao
Xinyang Lin
Shuang Xu
Yang Yang
...
Fei Wei
Xinyu Zhang
Bo Zhang
Xiaolin Wei
Chunhua Shen
MLLM
103
44
0
28 Dec 2023
InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks
Zhe Chen
Jiannan Wu
Wenhai Wang
Weijie Su
Guo Chen
...
Bin Li
Ping Luo
Tong Lu
Yu Qiao
Jifeng Dai
VLM, MLLM
262
1,216
0
21 Dec 2023
DriveMLM: Aligning Multi-Modal Large Language Models with Behavioral Planning States for Autonomous Driving
Wenhai Wang
Jiangwei Xie
ChuanYang Hu
Haoming Zou
Jianan Fan
...
Lewei Lu
Xizhou Zhu
Xiaogang Wang
Yu Qiao
Jifeng Dai
85
146
0
14 Dec 2023
MP5: A Multi-modal Open-ended Embodied System in Minecraft via Active Perception
Yiran Qin
Enshen Zhou
Qichang Liu
Zhen-fei Yin
Lu Sheng
Ruimao Zhang
Yu Qiao
Jing Shao
LM&Ro
97
50
0
12 Dec 2023
On the Robustness of Large Multimodal Models Against Image Adversarial Attacks
Xuanming Cui
Alejandro Aparcedo
Young Kyun Jang
Ser-Nam Lim
AAML, VLM
74
47
0
06 Dec 2023
Dolphins: Multimodal Language Model for Driving
Yingzi Ma
Yulong Cao
Jiachen Sun
Marco Pavone
Chaowei Xiao
MLLM
101
63
0
01 Dec 2023
BadCLIP: Dual-Embedding Guided Backdoor Attack on Multimodal Contrastive Learning
Siyuan Liang
Mingli Zhu
Aishan Liu
Baoyuan Wu
Xiaochun Cao
Ee-Chien Chang
102
58
0
20 Nov 2023
Meta Prompting for AI Systems
Yifan Zhang
Yang Yuan
Andrew Chi-Chih Yao
LLMAG, LRM
117
6
0
20 Nov 2023
FigStep: Jailbreaking Large Vision-Language Models via Typographic Visual Prompts
Yichen Gong
Delong Ran
Jinyuan Liu
Conglei Wang
Tianshuo Cong
Anyu Wang
Sisi Duan
Xiaoyun Wang
MLLM
224
160
0
09 Nov 2023
CogVLM: Visual Expert for Pretrained Language Models
Weihan Wang
Qingsong Lv
Wenmeng Yu
Wenyi Hong
Ji Qi
...
Bin Xu
Juanzi Li
Yuxiao Dong
Ming Ding
Jie Tang
VLM, MLLM
128
515
0
06 Nov 2023
Vision-Language Foundation Models as Effective Robot Imitators
Xinghang Li
Minghuan Liu
Hanbo Zhang
Cunjun Yu
Jie Xu
...
Ya Jing
Weinan Zhang
Huaping Liu
Hang Li
Tao Kong
LM&Ro
128
170
0
02 Nov 2023
Formalizing and Benchmarking Prompt Injection Attacks and Defenses
Yupei Liu
Yuqi Jia
Runpeng Geng
Jinyuan Jia
Neil Zhenqiang Gong
SILM, LLMAG
102
95
0
19 Oct 2023
Mistral 7B
Albert Q. Jiang
Alexandre Sablayrolles
A. Mensch
Chris Bamford
Devendra Singh Chaplot
...
Teven Le Scao
Thibaut Lavril
Thomas Wang
Timothée Lacroix
William El Sayed
MoE, LRM
113
2,246
0
10 Oct 2023
Improved Baselines with Visual Instruction Tuning
Haotian Liu
Chunyuan Li
Yuheng Li
Yong Jae Lee
VLM, MLLM
179
2,825
0
05 Oct 2023
Universal and Transferable Adversarial Attacks on Aligned Language Models
Andy Zou
Zifan Wang
Nicholas Carlini
Milad Nasr
J. Zico Kolter
Matt Fredrikson
295
1,518
0
27 Jul 2023
Jailbreak in pieces: Compositional Adversarial Attacks on Multi-Modal Language Models
Erfan Shayegani
Yue Dong
Nael B. Abu-Ghazaleh
105
152
0
26 Jul 2023
Abusing Images and Sounds for Indirect Instruction Injection in Multi-Modal LLMs
Eugene Bagdasaryan
Tsung-Yin Hsieh
Ben Nassi
Vitaly Shmatikov
63
86
0
19 Jul 2023
Prompt Injection attack against LLM-integrated Applications
Yi Liu
Gelei Deng
Yuekang Li
Kailong Wang
Zihao Wang
...
Tianwei Zhang
Yepang Liu
Haoyu Wang
Yanhong Zheng
Yang Liu
SILM
116
363
0
08 Jun 2023
On Evaluating Adversarial Robustness of Large Vision-Language Models
Yunqing Zhao
Tianyu Pang
Chao Du
Xiao Yang
Chongxuan Li
Ngai-Man Cheung
Min Lin
VLM, AAML, MLLM
133
180
0
26 May 2023
EmbodiedGPT: Vision-Language Pre-Training via Embodied Chain of Thought
Yao Mu
Qinglong Zhang
Mengkang Hu
Wen Wang
Mingyu Ding
Jun Jin
Bin Wang
Jifeng Dai
Yu Qiao
Ping Luo
LM&Ro, LRM
96
245
0
24 May 2023
CoEdIT: Text Editing by Task-Specific Instruction Tuning
Vipul Raheja
Dhruv Kumar
Ryan Koo
Dongyeop Kang
ALM
95
60
0
17 May 2023
InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning
Wenliang Dai
Junnan Li
Dongxu Li
A. M. H. Tiong
Junqi Zhao
Weisheng Wang
Boyang Albert Li
Pascale Fung
Steven C. H. Hoi
MLLM, VLM
148
2,098
0
11 May 2023
Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models
Pan Lu
Baolin Peng
Hao Cheng
Michel Galley
Kai-Wei Chang
Ying Nian Wu
Song-Chun Zhu
Jianfeng Gao
KELM, MLLM, LRM
122
324
0
19 Apr 2023
Visual Instruction Tuning
Haotian Liu
Chunyuan Li
Qingyang Wu
Yong Jae Lee
SyDa, VLM, MLLM
571
4,925
0
17 Apr 2023