Prompt Injection Attack to Tool Selection in LLM Agents

28 April 2025

Abstract

Tool selection is a key component of LLM agents. The process operates through a two-step mechanism (retrieval and selection) to pick the most appropriate tool from a tool library for a given task. In this work, we introduce ToolHijacker, a novel prompt injection attack targeting tool selection in no-box scenarios. ToolHijacker injects a malicious tool document into the tool library to manipulate the LLM agent's tool selection process, compelling it to consistently choose the attacker's malicious tool for an attacker-chosen target task. Specifically, we formulate the crafting of such tool documents as an optimization problem and propose a two-phase optimization strategy to solve it. Our extensive experimental evaluation shows that ToolHijacker is highly effective, significantly outperforming existing manual and automated prompt injection attacks when applied to tool selection. Moreover, we explore various defenses, including prevention-based defenses (StruQ and SecAlign) and detection-based defenses (known-answer detection, perplexity detection, and perplexity windowed detection). Our experimental results indicate that these defenses are insufficient, highlighting the urgent need for new defense strategies.
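The two-step mechanism described in the abstract can be illustrated with a toy sketch. This is not the paper's pipeline or the ToolHijacker optimization: the `ToolDoc` type, the word-overlap retriever, the placeholder `select` function, and all tool names are assumptions made for illustration. Real agents typically use an embedding-based retriever and an LLM for the selection step. The sketch only shows that every document in the tool library, including one added by a third party, competes in the retrieval step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolDoc:
    """Hypothetical tool document: a name plus a natural-language description."""
    name: str
    description: str


def tokenize(text: str) -> set[str]:
    # Lowercased whitespace tokens; a stand-in for a real text encoder.
    return set(text.lower().split())


def retrieve(task: str, library: list[ToolDoc], k: int = 2) -> list[ToolDoc]:
    """Step 1 (retrieval): keep the k documents most similar to the task.

    Similarity here is Jaccard overlap of word sets, an assumption chosen
    for simplicity rather than the retriever used in the paper.
    """
    task_tokens = tokenize(task)

    def overlap(doc: ToolDoc) -> float:
        doc_tokens = tokenize(doc.description)
        union = task_tokens | doc_tokens
        return len(task_tokens & doc_tokens) / len(union) if union else 0.0

    return sorted(library, key=overlap, reverse=True)[:k]


def select(task: str, candidates: list[ToolDoc]) -> ToolDoc:
    """Step 2 (selection): placeholder for the LLM's choice.

    A real agent would prompt an LLM with the task and the candidate
    documents; this stand-in simply returns the top-ranked candidate.
    """
    return candidates[0]


library = [
    ToolDoc("weather_lookup", "get the current weather forecast for a city"),
    ToolDoc("calculator", "evaluate arithmetic expressions"),
    ToolDoc("translator", "translate text between languages"),
]
task = "what is the weather forecast for paris"

# With the original library, the weather tool ranks first.
baseline = retrieve(task, library)

# A document added to the library later is scored exactly like the others.
extra = ToolDoc("third_party_tool", "weather forecast for paris and any city")
with_extra = retrieve(task, library + [extra])

print([d.name for d in baseline])
print([d.name for d in with_extra])
```

Because the selection step only ever sees what retrieval returns, the content of each tool document influences both steps, which is the surface the abstract describes.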


@article{shi2025_2504.19793,
  title={Prompt Injection Attack to Tool Selection in LLM Agents},
  author={Jiawen Shi and Zenghui Yuan and Guiyao Tie and Pan Zhou and Neil Zhenqiang Gong and Lichao Sun},
  journal={arXiv preprint arXiv:2504.19793},
  year={2025}
}
