Breaking Agents: Compromising Autonomous LLM Agents Through Malfunction Amplification

30 July 2024

Michael Backes

Savvas Zannettou

Papers citing "Breaking Agents: Compromising Autonomous LLM Agents Through Malfunction Amplification"

Title | Authors | Tags | Counts | Date
TRAIL: Trace Reasoning and Agentic Issue Localization | Darshan Deshpande, Varun Gangal, Hersh Mehta, Jitin Krishnan, Anand Kannappan, Rebecca Qian | - | 32 0 0 | 13 May 2025
Security of Internet of Agents: Attacks and Countermeasures | Yuntao Wang, Yanghe Pan, Shaolong Guo, Zhou Su | LLMAG | 44 0 0 | 12 May 2025
Safeguard-by-Development: A Privacy-Enhanced Development Paradigm for Multi-Agent Collaboration Systems | Jian Cui, Zichuan Li, Luyi Xing, Xiaojing Liao | - | 29 0 0 | 07 May 2025
ShieldAgent: Shielding Agents via Verifiable Safety Policy Reasoning | Z. Chen, Mintong Kang, Bo-wen Li | AAML | 44 3 0 | 26 Mar 2025
AgentSpec: Customizable Runtime Enforcement for Safe and Reliable LLM Agents | Haoyu Wang, Christopher M. Poskitt, Jun Sun | - | 44 0 0 | 24 Mar 2025
Multi-Agent Systems Execute Arbitrary Malicious Code | Harold Triedman, Rishi Jha, Vitaly Shmatikov | LLMAG, AAML | 101 2 0 | 15 Mar 2025
Commercial LLM Agents Are Already Vulnerable to Simple Yet Dangerous Attacks | Ang Li, Yin Zhou, Vethavikashini Chithrra Raghuram, Tom Goldstein, Micah Goldblum | AAML | 86 8 0 | 12 Feb 2025
On the Privacy Risk of In-context Learning | Haonan Duan, Adam Dziedzic, Mohammad Yaghini, Nicolas Papernot, Franziska Boenisch | SILM, PILM | 71 36 0 | 15 Nov 2024
GPTFUZZER: Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts | Jiahao Yu, Xingwei Lin, Zheng Yu, Xinyu Xing | SILM | 119 307 0 | 19 Sep 2023
Generative Agents: Interactive Simulacra of Human Behavior | J. Park, Joseph C. O'Brien, Carrie J. Cai, Meredith Ringel Morris, Percy Liang, Michael S. Bernstein | LM&Ro, AI4CE | 244 1,764 0 | 07 Apr 2023
Gradient-based Adversarial Attacks against Text Transformers | Chuan Guo, Alexandre Sablayrolles, Hervé Jégou, Douwe Kiela | SILM | 106 228 0 | 15 Apr 2021
Adversarial Example Generation with Syntactically Controlled Paraphrase Networks | Mohit Iyyer, John Wieting, Kevin Gimpel, Luke Zettlemoyer | AAML, GAN | 205 713 0 | 17 Apr 2018