2411.08088
Cited By
Safety case template for frontier AI: A cyber inability argument
12 November 2024
Arthur Goemans
Marie Davidsen Buhl
Jonas Schuett
Tomek Korbak
Jessica Wang
Benjamin Hilton
Geoffrey Irving
Papers citing "Safety case template for frontier AI: A cyber inability argument"
12 / 12 papers shown
Title
A Framework for Evaluating Emerging Cyberattack Capabilities of AI
Mikel Rodriguez
Raluca Ada Popa
Four Flynn
Lihao Liang
Allan Dafoe
Anna Wang
ELM
107
8
0
14 Mar 2025
Mapping AI Benchmark Data to Quantitative Risk Estimates Through Expert Elicitation
Malcolm Murray
Henry Papadatos
Otter Quarks
Pierre-François Gimenez
Simeon Campos
84
1
0
06 Mar 2025
A Frontier AI Risk Management Framework: Bridging the Gap Between Current AI Practices and Established Risk Management
Simeon Campos
Henry Papadatos
Fabien Roger
Chloé Touzet
Malcolm Murray
Otter Quarks
198
3
0
20 Feb 2025
AI Sandbagging: Language Models can Strategically Underperform on Evaluations
Teun van der Weij
Felix Hofstätter
Ollie Jaffe
Samuel F. Brown
Francis Rhys Ward
ELM
76
28
0
11 Jun 2024
Teams of LLM Agents can Exploit Zero-Day Vulnerabilities
Richard Fang
Antony Kellermann
Akul Gupta
Qiusi Zhan
R. Bindu
Daniel Kang
LLMAG
73
34
0
02 Jun 2024
A Mechanism-Based Approach to Mitigating Harms from Persuasive Generative AI
Seliem El-Sayed
Canfer Akbulut
Amanda McCroskery
Geoff Keeling
Zachary Kenton
...
Murray Shanahan
Michael Henry Tessler
Arthur Douillard
Tom Everitt
Sasha Brown
82
20
0
23 Apr 2024
Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws
Nikhil Sardana
Jacob P. Portes
Sasha Doubov
Jonathan Frankle
LRM
296
84
0
31 Dec 2023
Protecting Society from AI Misuse: When are Restrictions on Capabilities Warranted?
Markus Anderljung
Julian Hazell
37
31
0
16 Mar 2023
The Alignment Problem from a Deep Learning Perspective
Richard Ngo
Lawrence Chan
Sören Mindermann
94
192
0
30 Aug 2022
Training Compute-Optimal Large Language Models
Jordan Hoffmann
Sebastian Borgeaud
A. Mensch
Elena Buchatskaya
Trevor Cai
...
Karen Simonyan
Erich Elsen
Jack W. Rae
Oriol Vinyals
Laurent Sifre
AI4TS
197
1,946
0
29 Mar 2022
What are the attackers doing now? Automating cyber threat intelligence extraction from text on pace with the changing threat landscape: A survey
Md. Rayhanur Rahman
Rezvan Mahdavi-Hezaveh
Laurie A. Williams
35
48
0
14 Sep 2021
Safety Case Templates for Autonomous Systems
Robin Bloomfield
Gareth Fletcher
Heidy Khlaaf
Luke Hinde
Philippa Ryan
62
15
0
29 Jan 2021