2404.08144
Cited By
LLM Agents can Autonomously Exploit One-day Vulnerabilities
11 April 2024
Richard Fang
R. Bindu
Akul Gupta
Daniel Kang
SILM
LLMAG
Papers citing
"LLM Agents can Autonomously Exploit One-day Vulnerabilities"
19 / 19 papers shown
Title
Hierarchical Error Assessment of CAD Models for Aircraft Manufacturing-and-Measurement
Jin Huang
Honghua Chen
Mingqiang Wei
107
0
0
12 Jun 2025
VideoMat: Extracting PBR Materials from Video Diffusion Models
Jacob Munkberg
Zian Wang
Ruofan Liang
Tianchang Shen
J. Hasselgren
DiffM
VGen
109
6
0
11 Jun 2025
Improving LLM Agents with Reinforcement Learning on Cryptographic CTF Challenges
Lajos Muzsai
David Imolai
András Lukács
LLMAG
LRM
18
0
0
01 Jun 2025
DefenderBench: A Toolkit for Evaluating Language Agents in Cybersecurity Environments
Chiyu Zhang
Marc-Alexandre Cote
Michael Albada
Anush Sankaran
Jack W. Stokes
Tong Wang
Amir H. Abdi
William Blum
Muhammad Abdul-Mageed
LLMAG
AAML
ELM
51
0
0
31 May 2025
Dynamic Risk Assessments for Offensive Cybersecurity Agents
Boyi Wei
Benedikt Stroebl
Jiacen Xu
Joie Zhang
Zhou Li
Peter Henderson
82
0
0
23 May 2025
RedTeamLLM: an Agentic AI framework for offensive security
Brian Challita
Pierre Parrend
LLMAG
133
0
0
11 May 2025
Benchmarking Practices in LLM-driven Offensive Security: Testbeds, Metrics, and Experiment Design
A. Happe
Jürgen Cito
88
1
0
14 Apr 2025
Frontier AI's Impact on the Cybersecurity Landscape
Wenbo Guo
Yujin Potter
Tianneng Shi
Zhun Wang
Andy Zhang
Dawn Song
111
2
0
07 Apr 2025
CVE-Bench: A Benchmark for AI Agents' Ability to Exploit Real-World Web Application Vulnerabilities
Yuxuan Zhu
Antony Kellermann
Dylan Bowman
Philip Li
Akul Gupta
...
Avi Dhir
Sudhit Rao
Kaicheng Yu
Twm Stone
Daniel Kang
LLMAG
ELM
128
7
0
21 Mar 2025
Mapping AI Benchmark Data to Quantitative Risk Estimates Through Expert Elicitation
Malcolm Murray
Henry Papadatos
Otter Quarks
Pierre-François Gimenez
Simeon Campos
115
1
0
06 Mar 2025
Towards Automated Penetration Testing: Introducing LLM Benchmark, Analysis, and Improvements
I. Isozaki
Manil Shrestha
Rick Console
Edward Kim
ELM
120
7
0
24 Feb 2025
Construction and Evaluation of LLM-based agents for Semi-Autonomous penetration testing
Masaya Kobayashi
Masane Fuchi
Amar Zanashir
Tomonori Yoneda
Tomohiro Takagi
LLMAG
122
2
0
24 Feb 2025
Generative AI for Internet of Things Security: Challenges and Opportunities
Yan Lin Aung
Ivan Christian
Ye Dong
Xiaodong Ye
Sudipta Chattopadhyay
Jianying Zhou
100
1
0
13 Feb 2025
Benchmarking and Defending Against Indirect Prompt Injection Attacks on Large Language Models
Jingwei Yi
Yueqi Xie
Bin Zhu
Emre Kiciman
Guangzhong Sun
Xing Xie
Fangzhao Wu
AAML
174
82
0
28 Jan 2025
From LLMs to LLM-based Agents for Software Engineering: A Survey of Current, Challenges and Future
Haolin Jin
Linghan Huang
Haipeng Cai
Jun Yan
Bo Li
Huaming Chen
158
37
0
05 Aug 2024
On the Limitations of Compute Thresholds as a Governance Strategy
Sara Hooker
122
19
0
08 Jul 2024
Teams of LLM Agents can Exploit Zero-Day Vulnerabilities
Richard Fang
Antony Kellermann
Akul Gupta
Qiusi Zhan
R. Bindu
Daniel Kang
LLMAG
103
36
0
02 Jun 2024
Societal Adaptation to Advanced AI
Jamie Bernardi
Gabriel Mukobi
Hilary Greaves
Lennart Heim
Markus Anderljung
111
7
0
16 May 2024
TDAG: A Multi-Agent Framework based on Dynamic Task Decomposition and Agent Generation
Yaoxiang Wang
Zhiyong Wu
Junfeng Yao
Jinsong Su
LLMAG
151
12
0
15 Feb 2024