2309.15817
Cited By
Identifying the Risks of LM Agents with an LM-Emulated Sandbox
25 September 2023
Yangjun Ruan
Honghua Dong
Andrew Wang
Silviu Pitis
Yongchao Zhou
Jimmy Ba
Yann Dubois
Chris J. Maddison
Tatsunori Hashimoto
LLMAG
ELM
Papers citing
"Identifying the Risks of LM Agents with an LM-Emulated Sandbox"
50 / 79 papers shown
Title
A Survey on the Safety and Security Threats of Computer-Using Agents: JARVIS or Ultron?
Ada Chen
Yongjiang Wu
Jingyang Zhang
Shu Yang
Jen-tse Huang
Kun Wang
Wenxuan Wang
Shuai Wang
ELM
12
0
0
16 May 2025
WASP: Benchmarking Web Agent Security Against Prompt Injection Attacks
Ivan Evtimov
Arman Zharmagambetov
Aaron Grattafiori
Chuan Guo
Kamalika Chaudhuri
AAML
35
0
0
22 Apr 2025
On the Robustness of GUI Grounding Models Against Image Attacks
Haoren Zhao
Tianyi Chen
Zhen Wang
AAML
36
0
0
07 Apr 2025
Exploiting Fine-Grained Skip Behaviors for Micro-Video Recommendation
Sanghyuck Lee
Sangkeun Park
Jaesung Lee
53
0
0
04 Apr 2025
SandboxEval: Towards Securing Test Environment for Untrusted Code
Rafiqul Rabin
Jesse Hostetler
Sean McGregor
Brett Weir
Nick Judd
ELM
39
0
0
27 Mar 2025
StableToolBench-MirrorAPI: Modeling Tool Environments as Mirrors of 7,000+ Real-World APIs
Zhicheng Guo
Sijie Cheng
Yuchen Niu
Hao Wang
Sicheng Zhou
Wenbing Huang
Yang Liu
CLL
OffRL
88
0
0
26 Mar 2025
AgentSpec: Customizable Runtime Enforcement for Safe and Reliable LLM Agents
Haoyu Wang
Christopher M. Poskitt
Jun Sun
39
0
0
24 Mar 2025
Survey on Evaluation of LLM-based Agents
Asaf Yehudai
Lilach Eden
Alan Li
Guy Uziel
Yilun Zhao
Roy Bar-Haim
Arman Cohan
Michal Shmueli-Scheuer
LLMAG
ELM
Presented at
ResearchTrend Connect | LLMAG
on
07 May 2025
95
7
0
20 Mar 2025
AgentDAM: Privacy Leakage Evaluation for Autonomous Web Agents
Arman Zharmagambetov
Chuan Guo
Ivan Evtimov
Maya Pavlova
Ruslan Salakhutdinov
Kamalika Chaudhuri
70
1
0
12 Mar 2025
RefactorBench: Evaluating Stateful Reasoning in Language Agents Through Code
Dhruv Gautam
Spandan Garg
Jinu Jang
Neel Sundaresan
Roshanak Zilouchian Moghaddam
LLMAG
LRM
75
2
0
10 Mar 2025
Research on Superalignment Should Advance Now with Parallel Optimization of Competence and Conformity
HyunJin Kim
Xiaoyuan Yi
Jing Yao
Muhua Huang
Jinyeong Bak
James Evans
Xing Xie
44
0
0
08 Mar 2025
SafeArena: Evaluating the Safety of Autonomous Web Agents
Ada Defne Tur
Nicholas Meade
Xing Han Lù
Alejandra Zambrano
Arkil Patel
Esin Durmus
Spandana Gella
Karolina Stańczak
Siva Reddy
LLMAG
ELM
87
2
0
06 Mar 2025
UDora: A Unified Red Teaming Framework against LLM Agents by Dynamically Hijacking Their Own Reasoning
Junzhe Zhang
Shuang Yang
B. Li
AAML
LLMAG
58
0
0
28 Feb 2025
Mimicking the Familiar: Dynamic Command Generation for Information Theft Attacks in LLM Tool-Learning System
Ziyou Jiang
Mingyang Li
Guowei Yang
Junjie Wang
Yuekai Huang
Zhiyuan Chang
Qing Wang
AAML
54
1
0
17 Feb 2025
AgentGuard: Repurposing Agentic Orchestrator for Safety Evaluation of Tool Orchestration
Jizhou Chen
Samuel Lee Cong
LLMAG
38
2
0
13 Feb 2025
The AI Agent Index
Stephen Casper
Luke Bailey
Rosco Hunter
Carson Ezell
Emma Cabalé
...
Phillip J. K. Christoffersen
A. Pinar Ozisik
Rakshit Trivedi
Dylan Hadfield-Menell
Noam Kolt
80
5
0
03 Feb 2025
Episodic memory in AI agents poses risks that should be studied and mitigated
Chad DeChant
64
2
0
20 Jan 2025
Deploying Foundation Model Powered Agent Services: A Survey
Wenchao Xu
Jinyu Chen
Peirong Zheng
Xiaoquan Yi
Tianyi Tian
...
Quan Wan
Yining Qi
Yunfeng Fan
Qinliang Su
Xuemin Shen
AI4CE
119
1
0
18 Dec 2024
SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models
Jiale Cheng
Xiao-Chang Liu
C. Wang
Xiaotao Gu
Yaojie Lu
Dan Zhang
Yuxiao Dong
J. Tang
Hongning Wang
Minlie Huang
LRM
126
3
0
16 Dec 2024
Attacking Vision-Language Computer Agents via Pop-ups
Yanzhe Zhang
Tao Yu
Diyi Yang
AAML
VLM
35
20
0
04 Nov 2024
CRMArena: Understanding the Capacity of LLM Agents to Perform Professional CRM Tasks in Realistic Environments
Kung-Hsiang Huang
Akshara Prabhakar
Sidharth Dhawan
Yixin Mao
Huan Wang
Silvio Savarese
Caiming Xiong
Philippe Laban
C. Wu
44
7
0
04 Nov 2024
MobileSafetyBench: Evaluating Safety of Autonomous Agents in Mobile Device Control
Juyong Lee
Dongyoon Hahm
June Suk Choi
W. Bradley Knox
Kimin Lee
LLMAG
ELM
AAML
LM&Ro
43
2
0
23 Oct 2024
Breaking ReAct Agents: Foot-in-the-Door Attack Will Get You In
Itay Nakash
George Kour
Guy Uziel
Ateret Anaby-Tavor
AAML
LLMAG
40
4
0
22 Oct 2024
DAWN: Designing Distributed Agents in a Worldwide Network
Zahra Aminiranjbar
Jianan Tang
Qiudan Wang
Shubha Pant
Mahesh Viswanathan
LLMAG
AI4CE
26
2
0
11 Oct 2024
From Interaction to Impact: Towards Safer AI Agents Through Understanding and Evaluating UI Operation Impacts
Zhuohao Jerry Zhang
E. Schoop
Jeffrey Nichols
Anuj Mahajan
Amanda Swearngin
LLMAG
31
0
0
11 Oct 2024
Refusal-Trained LLMs Are Easily Jailbroken As Browser Agents
Priyanshu Kumar
Elaine Lau
Saranya Vijayakumar
Tu Trinh
Scale Red Team
...
Sean Hendryx
Shuyan Zhou
Matt Fredrikson
Summer Yue
Zifan Wang
LLMAG
34
17
0
11 Oct 2024
Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates
Xiaosen Zheng
Tianyu Pang
Chao Du
Qian Liu
Jing Jiang
Min-Bin Lin
44
8
0
09 Oct 2024
Developing Assurance Cases for Adversarial Robustness and Regulatory Compliance in LLMs
Tomas Bueno Momcilovic
Dian Balta
Beat Buesser
Giulio Zizzo
Mark Purcell
AAML
28
0
0
04 Oct 2024
Holistic Automated Red Teaming for Large Language Models through Top-Down Test Case Generation and Multi-turn Interaction
Jinchuan Zhang
Yan Zhou
Yaxin Liu
Ziming Li
Songlin Hu
AAML
28
3
0
25 Sep 2024
Sparse Rewards Can Self-Train Dialogue Agents
B. Lattimer
Varun Gangal
Ryan McDonald
Yi Yang
LLMAG
29
2
0
06 Sep 2024
PrivacyLens: Evaluating Privacy Norm Awareness of Language Models in Action
Yijia Shao
Tianshi Li
Weiyan Shi
Yanchen Liu
Diyi Yang
PILM
58
14
0
29 Aug 2024
Athena: Safe Autonomous Agents with Verbal Contrastive Learning
Tanmana Sadhu
Ali Pesaranghader
Yanan Chen
Dong Hoon Yi
ELM
LLMAG
AAML
31
0
0
20 Aug 2024
Caution for the Environment: Multimodal Agents are Susceptible to Environmental Distractions
Xinbei Ma
Yiting Wang
Yao Yao
Tongxin Yuan
Aston Zhang
Zhuosheng Zhang
Hai Zhao
AAML
LLMAG
32
17
0
05 Aug 2024
Operationalizing Contextual Integrity in Privacy-Conscious Assistants
Sahra Ghalebikesabi
Eugene Bagdasaryan
Ren Yi
Itay Yona
Ilia Shumailov
...
Robert Stanforth
Leonard Berrada
Pushmeet Kohli
Po-Sen Huang
Borja Balle
32
8
0
05 Aug 2024
Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress?
Richard Ren
Steven Basart
Adam Khoja
Alice Gatti
Long Phan
...
Alexander Pan
Gabriel Mukobi
Ryan H. Kim
Stephen Fitz
Dan Hendrycks
ELM
26
21
0
31 Jul 2024
Breaking Agents: Compromising Autonomous LLM Agents Through Malfunction Amplification
Boyang Zhang
Yicong Tan
Yun Shen
Ahmed Salem
Michael Backes
Savvas Zannettou
Yang Zhang
LLMAG
AAML
44
14
0
30 Jul 2024
The Emerged Security and Privacy of LLM Agent: A Survey with Case Studies
Feng He
Tianqing Zhu
Dayong Ye
Bo Liu
Wanlei Zhou
Philip S. Yu
PILM
LLMAG
ELM
68
24
0
28 Jul 2024
What Affects the Stability of Tool Learning? An Empirical Study on the Robustness of Tool Learning Frameworks
Chengrui Huang
Zhengliang Shi
Yuntao Wen
Xiuying Chen
Peng Han
Shen Gao
Shuo Shang
39
1
0
03 Jul 2024
AI Agents That Matter
Sayash Kapoor
Benedikt Stroebl
Zachary S. Siegel
Nitya Nadgir
Arvind Narayanan
49
36
0
01 Jul 2024
AutoDetect: Towards a Unified Framework for Automated Weakness Detection in Large Language Models
Jiale Cheng
Yida Lu
Xiaotao Gu
Pei Ke
Xiao-Yang Liu
Yuxiao Dong
Hongning Wang
Jie Tang
Minlie Huang
37
4
0
24 Jun 2024
AgentDojo: A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents
Edoardo Debenedetti
Jie Zhang
Mislav Balunović
Luca Beurer-Kellner
Marc Fischer
Florian Tramèr
LLMAG
AAML
56
26
1
19 Jun 2024
APPL: A Prompt Programming Language for Harmonious Integration of Programs and Large Language Model Prompts
Honghua Dong
Qidong Su
Yubo Gao
Zhaoyu Li
Yangjun Ruan
Gennady Pekhimenko
Chris J. Maddison
Xujie Si
LLMAG
34
1
0
19 Jun 2024
τ-bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains
Shunyu Yao
Noah Shinn
P. Razavi
Karthik Narasimhan
ALM
41
55
0
17 Jun 2024
VillagerAgent: A Graph-Based Multi-Agent Framework for Coordinating Complex Task Dependencies in Minecraft
Yubo Dong
Xukun Zhu
Zhengzhe Pan
Linchao Zhu
Yi Yang
38
11
0
09 Jun 2024
A Survey of Language-Based Communication in Robotics
William Hunt
Sarvapali D. Ramchurn
Mohammad D. Soorati
LM&Ro
65
12
0
06 Jun 2024
AI Agents Under Threat: A Survey of Key Security Challenges and Future Pathways
Zehang Deng
Yongjian Guo
Changzhou Han
Wanlun Ma
Junwu Xiong
Sheng Wen
Yang Xiang
44
23
0
04 Jun 2024
Safeguarding Large Language Models: A Survey
Yi Dong
Ronghui Mu
Yanghao Zhang
Siqi Sun
Tianle Zhang
...
Yi Qi
Jinwei Hu
Jie Meng
Saddek Bensalem
Xiaowei Huang
OffRL
KELM
AILaw
35
19
0
03 Jun 2024
Improved Few-Shot Jailbreaking Can Circumvent Aligned Language Models and Their Defenses
Xiaosen Zheng
Tianyu Pang
Chao Du
Qian Liu
Jing Jiang
Min-Bin Lin
AAML
68
29
0
03 Jun 2024
Tool Learning with Large Language Models: A Survey
Changle Qu
Sunhao Dai
Xiaochi Wei
Hengyi Cai
Shuaiqiang Wang
Dawei Yin
Jun Xu
Jirong Wen
LLMAG
31
80
0
28 May 2024
ALI-Agent: Assessing LLMs' Alignment with Human Values via Agent-based Evaluation
Jingnan Zheng
Han Wang
An Zhang
Tai D. Nguyen
Jun Sun
Tat-Seng Chua
LLMAG
40
14
0
23 May 2024