Good and safe uses of AI Oracles

15 November 2017

Papers citing "Good and safe uses of AI Oracles"

Title
Negative Human Rights as a Basis for Long-term AI Safety and Regulation. Ondrej Bajgar, Jan Horenovsky. FaML. 24 9 0. 31 Aug 2022.
Scoring Rules for Performative Binary Prediction. Alan Chan. 31 1 0. 05 Jul 2022.
Goal Misgeneralization in Deep Reinforcement Learning. L. Langosco, Jack Koch, Lee D. Sharkey, J. Pfau, Laurent Orseau, David M. Krueger. 30 78 0. 28 May 2021.
Avoiding Tampering Incentives in Deep RL via Decoupled Approval. J. Uesato, Ramana Kumar, Victoria Krakovna, Tom Everitt, Richard Ngo, Shane Legg. 28 14 0. 17 Nov 2020.
REALab: An Embedded Perspective on Tampering. Ramana Kumar, J. Uesato, Richard Ngo, Tom Everitt, Victoria Krakovna, Shane Legg. 30 10 0. 17 Nov 2020.
Hidden Incentives for Auto-Induced Distributional Shift. David M. Krueger, Tegan Maharaj, Jan Leike. 13 49 0. 19 Sep 2020.
AGI Safety Literature Review. Tom Everitt, G. Lea, Marcus Hutter. AI4CE. 36 115 0. 03 May 2018.