2505.07584
SecReEvalBench: A Multi-turned Security Resilience Evaluation Benchmark for Large Language Models
12 May 2025
Huining Cui
Wei Liu
AAML
ELM
Papers citing
"SecReEvalBench: A Multi-turned Security Resilience Evaluation Benchmark for Large Language Models"
9 / 9 papers shown
Adversarial Prompt Evaluation: Systematic Benchmarking of Guardrails Against Prompt Input Attacks on LLMs
Giulio Zizzo
Giandomenico Cornacchia
Kieran Fraser
Muhammad Zaid Hameed
Ambrish Rawat
Beat Buesser
Mark Purcell
Pin-Yu Chen
P. Sattigeri
Kush R. Varshney
AAML
104
5
0
24 Feb 2025
OCCULT: Evaluating Large Language Models for Offensive Cyber Operation Capabilities
Michael Kouremetis
Marissa Dotter
Alex Byrne
Dan Martin
Ethan Michalak
Gianpaolo Russo
Michael Threet
Guido Zarrella
ELM
88
11
0
18 Feb 2025
Benchmarking Prompt Engineering Techniques for Secure Code Generation with GPT Models
Marc Bruni
Fabio Gabrielli
Mohammad Ghafari
Martin Kropp
SILM
72
6
0
09 Feb 2025
Benchmarking and Defending Against Indirect Prompt Injection Attacks on Large Language Models
Jingwei Yi
Yueqi Xie
Bin Zhu
Emre Kiciman
Guangzhong Sun
Xing Xie
Fangzhao Wu
AAML
169
82
0
28 Jan 2025
PRISMe: A Novel LLM-Powered Tool for Interactive Privacy Policy Assessment
Vincent Freiberger
Arthur Fleig
Erik Buchmann
81
2
0
28 Jan 2025
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
DeepSeek-AI
Daya Guo
Dejian Yang
Haowei Zhang
Junxiao Song
...
Shiyu Wang
S. Yu
Shunfeng Zhou
Shuting Pan
S.S. Li
ReLM
VLM
OffRL
AI4TS
LRM
382
2,020
0
22 Jan 2025
CWEval: Outcome-driven Evaluation on Functionality and Security of LLM Code Generation
Jinjun Peng
Leyi Cui
Kele Huang
Junfeng Yang
Baishakhi Ray
ELM
131
13
0
14 Jan 2025
Towards Understanding the Fragility of Multilingual LLMs against Fine-Tuning Attacks
Samuele Poppi
Zheng-Xin Yong
Yifei He
Bobbie Chern
Han Zhao
Aobo Yang
Jianfeng Chi
AAML
158
21
0
23 Oct 2024
Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents
H. Zhang
Jingyuan Huang
Kai Mei
Yifei Yao
Zhenting Wang
Chenlu Zhan
Hongwei Wang
Yongfeng Zhang
AAML
LLMAG
ELM
189
40
0
03 Oct 2024