Conservative Agency via Attainable Utility Preservation

26 February 2019

Papers citing "Conservative Agency via Attainable Utility Preservation"

Title | Authors | Counts | Date
TraCeS: Trajectory Based Credit Assignment From Sparse Safety Feedback | Siow Meng Low, Akshat Kumar | 48 0 0 | 17 Apr 2025
Societal Alignment Frameworks Can Improve LLM Alignment | Karolina Stańczak, Nicholas Meade, Mehar Bhatia, Hattie Zhou, Konstantin Böttinger, ..., Timothy P. Lillicrap, Ana Marasović, Sylvie Delacroix, Gillian K. Hadfield, Siva Reddy | 206 0 0 | 27 Feb 2025
The Benefits of Power Regularization in Cooperative Reinforcement Learning | Michelle Li, Michael Dennis | 41 3 0 | 17 Jun 2024
Open-Endedness is Essential for Artificial Superhuman Intelligence | Edward Hughes, Michael Dennis, Jack Parker-Holder, Feryal M. P. Behbahani, Aditi Mavalankar, Yuge Shi, Tom Schaul, Tim Rocktäschel | 45 22 0 | 06 Jun 2024
Defining and Characterizing Reward Hacking | Joar Skalse, Nikolaus H. R. Howe, Dmitrii Krasheninnikov, David M. Krueger | 59 56 0 | 27 Sep 2022
Improving performance in multi-objective decision-making in Bottles environments with soft maximin approaches | Benjamin J. Smith, Robert Klassert, Roland Pihlakas | 13 0 0 | 08 Aug 2022
Formalizing the Problem of Side Effect Regularization | Alexander Matt Turner, Aseem Saxena, Prasad Tadepalli | 24 2 0 | 23 Jun 2022
Avoiding Negative Side Effects due to Incomplete Knowledge of AI Systems | Sandhya Saisubramanian, S. Zilberstein, Ece Kamar | 14 21 0 | 24 Aug 2020
Avoiding Side Effects in Complex Environments | Alexander Matt Turner, Neale Ratzlaff, Prasad Tadepalli | 30 34 0 | 11 Jun 2020
Reinforcement Learning Under Moral Uncertainty | Adrien Ecoffet, Joel Lehman | 17 32 0 | 08 Jun 2020
Safe Exploration in Markov Decision Processes | T. Moldovan, Pieter Abbeel | 78 308 0 | 22 May 2012