2502.06656
Cited By
A Frontier AI Risk Management Framework: Bridging the Gap Between Current AI Practices and Established Risk Management
20 February 2025
Simeon Campos
Henry Papadatos
Fabien Roger
Chloé Touzet
Malcolm Murray
Otter Quarks
Papers citing
"A Frontier AI Risk Management Framework: Bridging the Gap Between Current AI Practices and Established Risk Management"
12 / 12 papers shown
Title
Mapping AI Benchmark Data to Quantitative Risk Estimates Through Expert Elicitation
Malcolm Murray
Henry Papadatos
Otter Quarks
Pierre-François Gimenez
Simeon Campos
76
1
0
06 Mar 2025
A sketch of an AI control safety case
Tomek Korbak
Joshua Clymer
Benjamin Hilton
Buck Shlegeris
Geoffrey Irving
97
7
0
28 Jan 2025
Safety case template for frontier AI: A cyber inability argument
Arthur Goemans
Marie Davidsen Buhl
Jonas Schuett
Tomek Korbak
Jessica Wang
Benjamin Hilton
Geoffrey Irving
76
16
0
12 Nov 2024
Towards evaluations-based safety cases for AI scheming
Mikita Balesni
Marius Hobbhahn
David Lindner
Alexander Meinke
Tomek Korbak
...
Dan Braun
Bilal Chughtai
Owain Evans
Daniel Kokotajlo
Lucius Bushnaq
ELM
57
12
0
29 Oct 2024
Prioritizing High-Consequence Biological Capabilities in Evaluations of Artificial Intelligence Models
Jaspreet Pannu
Doni Bloomfield
Alex W. Zhu
R. MacKnight
Gabe Gomes
Anita Cicero
Thomas V. Inglesby
SILM
ELM
44
5
0
25 May 2024
Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems
David Dalrymple
Joar Skalse
Yoshua Bengio
Stuart J. Russell
Max Tegmark
...
Clark Barrett
Ding Zhao
Zhi-Xuan Tan
Jeannette Wing
Joshua Tenenbaum
63
55
0
10 May 2024
Safety Cases: How to Justify the Safety of Advanced AI Systems
Joshua Clymer
Nick Gabrieli
David Krueger
Thomas Larsen
58
28
0
15 Mar 2024
LLM Agents can Autonomously Hack Websites
Richard Fang
R. Bindu
Akul Gupta
Qiusi Zhan
Daniel Kang
LLMAG
38
54
0
06 Feb 2024
AI Control: Improving Safety Despite Intentional Subversion
Ryan Greenblatt
Buck Shlegeris
Kshitij Sachan
Fabien Roger
49
41
0
12 Dec 2023
Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small
Kevin Wang
Alexandre Variengien
Arthur Conmy
Buck Shlegeris
Jacob Steinhardt
275
531
0
01 Nov 2022
Training Compute-Optimal Large Language Models
Jordan Hoffmann
Sebastian Borgeaud
A. Mensch
Elena Buchatskaya
Trevor Cai
...
Karen Simonyan
Erich Elsen
Jack W. Rae
Oriol Vinyals
Laurent Sifre
AI4TS
114
1,894
0
29 Mar 2022
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
435
4,662
0
23 Jan 2020