Modeling AGI Safety Frameworks with Causal Influence Diagrams
arXiv:1906.08663 · 20 June 2019
Tom Everitt, Ramana Kumar, Victoria Krakovna, Shane Legg
Papers citing "Modeling AGI Safety Frameworks with Causal Influence Diagrams" (11 papers)
Scalable agent alignment via reward modeling: a research direction
Jan Leike, David M. Krueger, Tom Everitt, Miljan Martic, Vishal Maini, Shane Legg · 19 Nov 2018

Supervising strong learners by amplifying weak experts
Paul Christiano, Buck Shlegeris, Dario Amodei · 19 Oct 2018

Agents and Devices: A Relative Definition of Agency
Laurent Orseau, Simon McGregor McGill, Shane Legg · 31 May 2018

AGI Safety Literature Review
Tom Everitt, G. Lea, Marcus Hutter · 03 May 2018

AI safety via debate
G. Irving, Paul Christiano, Dario Amodei · 02 May 2018

Good and safe uses of AI Oracles
Stuart Armstrong, Xavier O'Rorke · 15 Nov 2017

Learning to reinforcement learn
Jane X. Wang, Z. Kurth-Nelson, Dhruva Tirumala, Hubert Soyer, Joel Z Leibo, Rémi Munos, Charles Blundell, D. Kumaran, M. Botvinick · 17 Nov 2016

Concrete Problems in AI Safety
Dario Amodei, C. Olah, Jacob Steinhardt, Paul Christiano, John Schulman, Dandelion Mané · 21 Jun 2016

Cooperative Inverse Reinforcement Learning
Dylan Hadfield-Menell, Anca Dragan, Pieter Abbeel, Stuart J. Russell · 09 Jun 2016

Self-Modification of Policy and Utility Function in Rational Agents
Tom Everitt, Daniel Filan, Mayank Daswani, Marcus Hutter · 10 May 2016

Model-based Utility Functions
B. Hibbard · 16 Nov 2011