Purple Llama CyberSecEval: A Secure Coding Benchmark for Language Models
arXiv: 2312.04724 · 7 December 2023
Manish P Bhatt, Sahana Chennabasappa, Cyrus Nikolaidis, Shengye Wan, Ivan Evtimov, Dominik Gabi, Daniel Song, Faizan Ahmad, Cornelius Aschermann, Lorenzo Fontana, Sasha Frolov, Ravi Prakash Giri, Dhaval Kapil, Yiannis Kozyrakis, David LeBlanc, James Milazzo, Aleksandar Straumann, Gabriel Synnaeve, Varun Vontimitta, Spencer Whitman, Joshua Saxe · ELM
Papers citing "Purple Llama CyberSecEval: A Secure Coding Benchmark for Language Models" (16 of 16 papers shown)
Can You Really Trust Code Copilots? Evaluating Large Language Models from a Code Security Perspective
Yutao Mou, Xiao Deng, Yuxiao Luo, Shikun Zhang, Wei Ye · ELM · 15 May 2025

SecReEvalBench: A Multi-turned Security Resilience Evaluation Benchmark for Large Language Models
Huining Cui, Wei Liu · AAML, ELM · 12 May 2025

SecRepoBench: Benchmarking LLMs for Secure Code Generation in Real-World Repositories
Connor Dilgren, Purva Chiniya, Luke Griffith, Yu Ding, Yizheng Chen · 29 Apr 2025

The Digital Cybersecurity Expert: How Far Have We Come?
Dawei Wang, Geng Zhou, Xianglong Li, Yu Bai, Li Chen, Ting Qin, Jian Sun, Didong Li · ELM · 16 Apr 2025

SandboxEval: Towards Securing Test Environment for Untrusted Code
Rafiqul Rabin, Jesse Hostetler, Sean McGregor, Brett Weir, Nick Judd · ELM · 27 Mar 2025

A Framework for Evaluating Emerging Cyberattack Capabilities of AI
Mikel Rodriguez, Raluca Ada Popa, Four Flynn, Lihao Liang, Allan Dafoe, Anna Wang · ELM · 14 Mar 2025

Transferable Foundation Models for Geometric Tasks on Point Cloud Representations: Geometric Neural Operators
Blaine Quackenbush, P. Atzberger · 3DPC, AI4CE · 06 Mar 2025

Lessons From Red Teaming 100 Generative AI Products
Blake Bullwinkel, Amanda Minnich, Shiven Chawla, Gary Lopez, Martin Pouliot, ..., Pete Bryan, Ram Shankar Siva Kumar, Yonatan Zunger, Chang Kawaguchi, Mark Russinovich · AAML, VLM · 13 Jan 2025

Bridging Today and the Future of Humanity: AI Safety in 2024 and Beyond
Shanshan Han · 09 Oct 2024

Root Defence Strategies: Ensuring Safety of LLM at the Decoding Level
Xinyi Zeng, Yuying Shang, Yutao Zhu, Jingyuan Zhang, Yu Tian · AAML · 09 Oct 2024

APILOT: Navigating Large Language Models to Generate Secure Code by Sidestepping Outdated API Pitfalls
Weiheng Bai, Keyang Xuan, Pengxiang Huang, Qiushi Wu, Jianing Wen, Jingjing Wu, Kangjie Lu · LLMAG, KELM · 25 Sep 2024

CYBERSECEVAL 3: Advancing the Evaluation of Cybersecurity Risks and Capabilities in Large Language Models
Shengye Wan, Cyrus Nikolaidis, Daniel Song, David Molnar, James Crnkovich, ..., Spencer Whitman, Stephanie Ding, Vlad Ionescu, Yue Li, Joshua Saxe · ELM · 02 Aug 2024

When LLMs Meet Cybersecurity: A Systematic Literature Review
Jie Zhang, Haoyu Bu, Hui Wen, Yu Chen, Lun Li, Hongsong Zhu · 06 May 2024

CyberSecEval 2: A Wide-Ranging Cybersecurity Evaluation Suite for Large Language Models
Manish P Bhatt, Sahana Chennabasappa, Yue Li, Cyrus Nikolaidis, Daniel Song, ..., Yaohui Chen, Dhaval Kapil, David Molnar, Spencer Whitman, Joshua Saxe · ELM · 19 Apr 2024

Online Safety Analysis for LLMs: a Benchmark, an Assessment, and a Path Forward
Xuan Xie, Jiayang Song, Zhehua Zhou, Yuheng Huang, Da Song, Lei Ma · OffRL · 12 Apr 2024

SafetyPrompts: a Systematic Review of Open Datasets for Evaluating and Improving Large Language Model Safety
Paul Röttger, Fabio Pernisi, Bertie Vidgen, Dirk Hovy · ELM, KELM · 08 Apr 2024