2410.02644
Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents
3 October 2024
Hanrong Zhang
Jingyuan Huang
Kai Mei
Yifei Yao
Zhenting Wang
Chenlu Zhan
Hongwei Wang
Yongfeng Zhang
AAML
LLMAG
ELM
Papers citing
"Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents"
15 / 15 papers shown
A Survey on the Safety and Security Threats of Computer-Using Agents: JARVIS or Ultron?
Ada Chen
Yongjiang Wu
Jingyang Zhang
Shu Yang
Jen-tse Huang
Kun Wang
Wenxuan Wang
Shuai Wang
ELM
17
0
0
16 May 2025
SecReEvalBench: A Multi-turned Security Resilience Evaluation Benchmark for Large Language Models
Huining Cui
Wei Liu
AAML
ELM
30
0
0
12 May 2025
Toward Generalizable Evaluation in the LLM Era: A Survey Beyond Benchmarks
Yixin Cao
Shibo Hong
Xuzhao Li
Jiahao Ying
Yubo Ma
...
Juanzi Li
Aixin Sun
Xuanjing Huang
Tat-Seng Chua
Tianwei Zhang
ALM
ELM
98
2
0
26 Apr 2025
WASP: Benchmarking Web Agent Security Against Prompt Injection Attacks
Ivan Evtimov
Arman Zharmagambetov
Aaron Grattafiori
Chuan Guo
Kamalika Chaudhuri
AAML
35
1
0
22 Apr 2025
Planet as a Brain: Towards Internet of AgentSites based on AIOS Server
Xiang Zhang
Yongfeng Zhang
44
0
0
19 Apr 2025
Progent: Programmable Privilege Control for LLM Agents
Tianneng Shi
Jingxuan He
Zhun Wang
Linyu Wu
Hongwei Li
Wenbo Guo
Dawn Song
LLMAG
44
0
0
16 Apr 2025
Emerging Cyber Attack Risks of Medical AI Agents
Jianing Qiu
Lin Li
Jiankai Sun
Hao Wei
Zhe Xu
K. Lam
Wu Yuan
AAML
35
2
0
02 Apr 2025
Get the Agents Drunk: Memory Perturbations in Autonomous Agent-based Recommender Systems
Shiyi Yang
Zhibo Hu
Chen Wang
Tong Yu
Xiwei Xu
Liming Zhu
Lina Yao
AAML
47
0
0
31 Mar 2025
ShieldAgent: Shielding Agents via Verifiable Safety Policy Reasoning
Z. Chen
Mintong Kang
Bo-wen Li
AAML
44
3
0
26 Mar 2025
Multi-Agent Systems Execute Arbitrary Malicious Code
Harold Triedman
Rishi Jha
Vitaly Shmatikov
LLMAG
AAML
104
2
0
15 Mar 2025
Cerebrum (AIOS SDK): A Platform for Agent Development, Deployment, Distribution, and Discovery
Balaji Rama
Kai Mei
Yongfeng Zhang
LLMAG
60
1
0
14 Mar 2025
Towards Action Hijacking of Large Language Model-based Agent
Yuyang Zhang
Kangjie Chen
Xudong Jiang
Yuxiang Sun
Run Wang
Lina Wang
LLMAG
AAML
75
3
0
14 Dec 2024
Bridging Today and the Future of Humanity: AI Safety in 2024 and Beyond
Shanshan Han
87
1
0
09 Oct 2024
ST-WebAgentBench: A Benchmark for Evaluating Safety and Trustworthiness in Web Agents
Ido Levy
Ben Wiesel
Sami Marreed
Alon Oved
Avi Yaeli
Segev Shlomov
LLMAG
39
15
0
09 Oct 2024
Data-centric NLP Backdoor Defense from the Lens of Memorization
Zhenting Wang
Zhizhi Wang
Mingyu Jin
Mengnan Du
Juan Zhai
Shiqing Ma
35
3
0
21 Sep 2024