Avoiding Wireheading with Value Reinforcement Learning

10 May 2016

Tom Everitt, Marcus Hutter

Papers citing "Avoiding Wireheading with Value Reinforcement Learning"

Title
MONA: Myopic Optimization with Non-myopic Approval Can Mitigate Multi-step Reward Hacking | Sebastian Farquhar, Vikrant Varma, David Lindner, David Elson, Caleb Biddulph, Ian Goodfellow, Rohin Shah | 141 2 0 | 22 Jan 2025
Death and Suicide in Universal Artificial Intelligence | Jarryd Martin, Tom Everitt, Marcus Hutter | 29 21 0 | 02 Jun 2016
Self-Modification of Policy and Utility Function in Rational Agents | Tom Everitt, Daniel Filan, Mayank Daswani, Marcus Hutter | 48 29 0 | 10 May 2016
Towards Resolving Unidentifiability in Inverse Reinforcement Learning | Kareem Amin, Satinder Singh | 41 30 0 | 25 Jan 2016
Learning the Preferences of Ignorant, Inconsistent Agents | Owain Evans, Andreas Stuhlmuller, Noah D. Goodman | 48 128 0 | 18 Dec 2015
Sequential Extensions of Causal and Evidential Decision Theory | Tom Everitt, Jan Leike, Marcus Hutter | CML | 41 15 0 | 24 Jun 2015
Model-based Utility Functions | B. Hibbard | 106 48 0 | 16 Nov 2011
Universal Intelligence: A Definition of Machine Intelligence | Shane Legg, Marcus Hutter | 98 641 0 | 20 Dec 2007