SafeArena: Evaluating the Safety of Autonomous Web Agents

6 March 2025

Papers citing "SafeArena: Evaluating the Safety of Autonomous Web Agents"

14 / 14 papers shown

Title
DoomArena: A framework for Testing AI Agents Against Evolving Security Threats Léo Boisvert Mihir Bansal Chandra Kiran Reddy Evuru Gabriel Huang Abhay Puri ... Quentin Cappart Jason Stanley Alexandre Lacoste Alexandre Drouin Krishnamurthy Dvijotham 127 3 0 18 Apr 2025
The BrowserGym Ecosystem for Web Agent Research Thibault Le Sellier De Chezelles Maxime Gasse Alexandre Lacoste Alexandre Drouin Massimo Caccia ... Siva Reddy Quentin Cappart Graham Neubig Ruslan Salakhutdinov Nicolas Chapados LLMAG 164 18 0 06 Dec 2024
ST-WebAgentBench: A Benchmark for Evaluating Safety and Trustworthiness in Web Agents Ido Levy Ben Wiesel Sami Marreed Alon Oved Avi Yaeli Segev Shlomov LLMAG 122 23 0 09 Oct 2024
The Dark Side of Function Calling: Pathways to Jailbreaking Large Language Models Zihui Wu Haichang Gao Jianping He Ping Wang 84 10 0 25 Jul 2024
WildGuard: Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs Seungju Han Kavel Rao Allyson Ettinger Liwei Jiang Bill Yuchen Lin Nathan Lambert Yejin Choi Nouha Dziri 116 100 0 26 Jun 2024
WebCanvas: Benchmarking Web Agents in Online Environments Yichen Pan Dehan Kong Sida Zhou Cheng Cui Yifei Leng ... Hangyu Liu Yanyi Shang Shuyan Zhou Tongshuang Wu Zhengyang Wu 111 43 0 18 Jun 2024
Great, Now Write an Article About That: The Crescendo Multi-Turn LLM Jailbreak Attack M. Russinovich Ahmed Salem Ronen Eldan 112 98 0 02 Apr 2024
InjecAgent: Benchmarking Indirect Prompt Injections in Tool-Integrated Large Language Model Agents Qiusi Zhan Zhixiang Liang Zifan Ying Daniel Kang LLMAG 120 102 0 05 Mar 2024
WebLINX: Real-World Website Navigation with Multi-Turn Dialogue Xing Han Lù Zdeněk Kasner Siva Reddy 80 77 0 08 Feb 2024
SteP: Stacked LLM Policies for Web Actions Paloma Sodhi S. Branavan Yoav Artzi Ryan McDonald LLMAG 104 33 0 05 Oct 2023
Universal and Transferable Adversarial Attacks on Aligned Language Models Andy Zou Zifan Wang Nicholas Carlini Milad Nasr J. Zico Kolter Matt Fredrikson 293 1,518 0 27 Jul 2023
WebArena: A Realistic Web Environment for Building Autonomous Agents Shuyan Zhou Frank F. Xu Hao Zhu Xuhui Zhou Robert Lo ... Tianyue Ou Yonatan Bisk Daniel Fried Uri Alon Graham Neubig LLMAG 178 492 0 25 Jul 2023
From Pixels to UI Actions: Learning to Follow Instructions via Graphical User Interfaces Peter Shaw Mandar Joshi James Cohan Jonathan Berant Panupong Pasupat Hexiang Hu Urvashi Khandelwal Kenton Lee Kristina Toutanova LLMAG LM&Ro 87 58 0 31 May 2023
Language Models are Few-Shot Learners Tom B. Brown Benjamin Mann Nick Ryder Melanie Subbiah Jared Kaplan ... Christopher Berner Sam McCandlish Alec Radford Ilya Sutskever Dario Amodei BDL 880 42,379 0 28 May 2020