arXiv:2406.11036
garak: A Framework for Security Probing Large Language Models
16 June 2024
Leon Derczynski
Erick Galinkin
Jeffrey Martin
Subho Majumdar
Nanna Inie
AAML
ELM
Papers citing
"garak: A Framework for Security Probing Large Language Models"
9 / 9 papers shown
JavelinGuard: Low-Cost Transformer Architectures for LLM Security
Yash Datta
Sharath Rajasekar
23
0
0
09 Jun 2025
Analysing Safety Risks in LLMs Fine-Tuned with Pseudo-Malicious Cyber Security Data
Adel ElZemity
Budi Arief
Shujun Li
89
0
0
15 May 2025
OET: Optimization-based prompt injection Evaluation Toolkit
Jinsheng Pan
Xiaogeng Liu
Chaowei Xiao
AAML
198
0
0
01 May 2025
aiXamine: Simplified LLM Safety and Security
Fatih Deniz
Dorde Popovic
Yazan Boshmaf
Euisuh Jeong
M. Ahmad
Sanjay Chawla
Issa M. Khalil
ELM
346
0
0
21 Apr 2025
A Framework for Evaluating Emerging Cyberattack Capabilities of AI
Mikel Rodriguez
Raluca Ada Popa
Four Flynn
Lihao Liang
Allan Dafoe
Anna Wang
ELM
155
8
0
14 Mar 2025
Probing Latent Subspaces in LLM for AI Security: Identifying and Manipulating Adversarial States
Xin Wei Chia
Swee Liang Wong
Jonathan Pan
AAML
75
1
0
12 Mar 2025
Prompt Inject Detection with Generative Explanation as an Investigative Tool
Jonathan Pan
Swee Liang Wong
Yidi Yuan
Xin Wei Chia
SILM
132
0
0
16 Feb 2025
Bridging Today and the Future of Humanity: AI Safety in 2024 and Beyond
Shanshan Han
175
1
0
09 Oct 2024
FigStep: Jailbreaking Large Vision-Language Models via Typographic Visual Prompts
Yichen Gong
Delong Ran
Jinyuan Liu
Conglei Wang
Tianshuo Cong
Anyu Wang
Sisi Duan
Xiaoyun Wang
MLLM
240
161
0
09 Nov 2023