Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2403.14720
Cited By
Defending Against Indirect Prompt Injection Attacks With Spotlighting
20 March 2024
Keegan Hines
Gary Lopez
Matthew Hall
Federico Zarfati
Yonatan Zunger
Emre Kiciman
AAML
SILM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Defending Against Indirect Prompt Injection Attacks With Spotlighting"
29 / 29 papers shown
Title
A Survey of Attacks on Large Language Models
Wenrui Xu
Keshab K. Parhi
AAML
ELM
0
0
0
18 May 2025
ProxyPrompt: Securing System Prompts against Prompt Extraction Attacks
Zhixiong Zhuang
Maria-Irina Nicolae
Hui-Po Wang
Mario Fritz
AAML
SILM
28
0
0
16 May 2025
Defending against Indirect Prompt Injection by Instruction Detection
Tongyu Wen
Chenglong Wang
Xiyuan Yang
Haoyu Tang
Yueqi Xie
Lingjuan Lyu
Zhicheng Dou
Fangzhao Wu
AAML
34
0
0
08 May 2025
CachePrune: Neural-Based Attribution Defense Against Indirect Prompt Injection Attacks
Rui Wang
Junda Wu
Yu Xia
Tong Yu
R. Zhang
Ryan Rossi
Lina Yao
Julian McAuley
AAML
SILM
51
0
0
29 Apr 2025
Robustness via Referencing: Defending against Prompt Injection Attacks by Referencing the Executed Instruction
Y. Chen
Haoran Li
Yuan Sui
Yi Liu
Yufei He
Yangqiu Song
Bryan Hooi
AAML
SILM
63
0
0
29 Apr 2025
Exploiting Fine-Grained Skip Behaviors for Micro-Video Recommendation
Sanghyuck Lee
Sangkeun Park
Jaesung Lee
53
0
0
04 Apr 2025
Multi-Agent Systems Execute Arbitrary Malicious Code
Harold Triedman
Rishi Jha
Vitaly Shmatikov
LLMAG
AAML
96
2
0
15 Mar 2025
ASIDE: Architectural Separation of Instructions and Data in Language Models
Egor Zverev
Evgenii Kortukov
Alexander Panfilov
Soroush Tabesh
Alexandra Volkova
Sebastian Lapuschkin
Wojciech Samek
Christoph H. Lampert
AAML
54
1
0
13 Mar 2025
Can Indirect Prompt Injection Attacks Be Detected and Removed?
Yulin Chen
Haoran Li
Yuan Sui
Yufei He
Yue Liu
Yangqiu Song
Bryan Hooi
AAML
44
3
0
23 Feb 2025
Control Illusion: The Failure of Instruction Hierarchies in Large Language Models
Yilin Geng
Hao Li
Honglin Mu
Xudong Han
Timothy Baldwin
Omri Abend
Eduard H. Hovy
Lea Frermann
41
2
0
21 Feb 2025
Lessons From Red Teaming 100 Generative AI Products
Blake Bullwinkel
Amanda Minnich
Shiven Chawla
Gary Lopez
Martin Pouliot
...
Pete Bryan
Ram Shankar Siva Kumar
Yonatan Zunger
Chang Kawaguchi
Mark Russinovich
AAML
VLM
37
5
0
13 Jan 2025
The Task Shield: Enforcing Task Alignment to Defend Against Indirect Prompt Injection in LLM Agents
Feiran Jia
Tong Wu
Xin Qin
Anna Squicciarini
LLMAG
AAML
86
4
0
21 Dec 2024
Attention Tracker: Detecting Prompt Injection Attacks in LLMs
Kuo-Han Hung
Ching-Yun Ko
Ambrish Rawat
I-Hsin Chung
Winston H. Hsu
Pin-Yu Chen
49
7
0
01 Nov 2024
Defense Against Prompt Injection Attack by Leveraging Attack Techniques
Yulin Chen
Haoran Li
Zihao Zheng
Yangqiu Song
Dekai Wu
Bryan Hooi
SILM
AAML
50
4
0
01 Nov 2024
FATH: Authentication-based Test-time Defense against Indirect Prompt Injection Attacks
Jiongxiao Wang
Fangzhou Wu
Wendi Li
Jinsheng Pan
Edward Suh
Zhuoqing Mao
Muhao Chen
Chaowei Xiao
AAML
40
6
0
28 Oct 2024
Prompt Infection: LLM-to-LLM Prompt Injection within Multi-Agent Systems
Donghyun Lee
Mo Tiwari
LLMAG
36
9
0
09 Oct 2024
Recent advancements in LLM Red-Teaming: Techniques, Defenses, and Ethical Considerations
Tarun Raheja
Nilay Pochhi
AAML
51
1
0
09 Oct 2024
Instructional Segment Embedding: Improving LLM Safety with Instruction Hierarchy
Tong Wu
Shujian Zhang
Kaiqiang Song
Silei Xu
Sanqiang Zhao
Ravi Agrawal
Sathish Indurthi
Chong Xiang
Prateek Mittal
Wenxuan Zhou
42
8
0
09 Oct 2024
VLMGuard: Defending VLMs against Malicious Prompts via Unlabeled Data
Xuefeng Du
Reshmi Ghosh
Robert Sim
Ahmed Salem
Vitor Carvalho
Emily Lawton
Yixuan Li
Jack W. Stokes
VLM
AAML
38
6
0
01 Oct 2024
The Emerged Security and Privacy of LLM Agent: A Survey with Case Studies
Feng He
Tianqing Zhu
Dayong Ye
Bo Liu
Wanlei Zhou
Philip S. Yu
PILM
LLMAG
ELM
68
24
0
28 Jul 2024
Operationalizing a Threat Model for Red-Teaming Large Language Models (LLMs)
Apurv Verma
Satyapriya Krishna
Sebastian Gehrmann
Madhavan Seshadri
Anu Pradhan
Tom Ault
Leslie Barrett
David Rabinowitz
John Doucette
Nhathai Phan
54
10
0
20 Jul 2024
Soft Begging: Modular and Efficient Shielding of LLMs against Prompt Injection and Jailbreaking based on Prompt Tuning
Simon Ostermann
Kevin Baum
Christoph Endres
Julia Masloh
P. Schramowski
AAML
54
1
0
03 Jul 2024
Adversarial Search Engine Optimization for Large Language Models
Fredrik Nestaas
Edoardo Debenedetti
Florian Tramèr
AAML
40
4
0
26 Jun 2024
AgentDojo: A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents
Edoardo Debenedetti
Jie Zhang
Mislav Balunović
Luca Beurer-Kellner
Marc Fischer
Florian Tramèr
LLMAG
AAML
56
26
1
19 Jun 2024
AI Agents Under Threat: A Survey of Key Security Challenges and Future Pathways
Zehang Deng
Yongjian Guo
Changzhou Han
Wanlun Ma
Junwu Xiong
Sheng Wen
Yang Xiang
44
23
0
04 Jun 2024
BadRAG: Identifying Vulnerabilities in Retrieval Augmented Generation of Large Language Models
Jiaqi Xue
Meng Zheng
Yebowen Hu
Fei Liu
Xun Chen
Qian Lou
AAML
SILM
35
25
0
03 Jun 2024
Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?
Egor Zverev
Sahar Abdelnabi
Soroush Tabesh
Mario Fritz
Christoph H. Lampert
56
19
0
11 Mar 2024
Privacy in Large Language Models: Attacks, Defenses and Future Directions
Haoran Li
Yulin Chen
Jinglong Luo
Yan Kang
Xiaojin Zhang
Qi Hu
Chunkit Chan
Yangqiu Song
PILM
48
42
0
16 Oct 2023
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
382
8,495
0
28 Jan 2022
1