Formalizing Embeddedness Failures in Universal Artificial Intelligence

23 May 2025

Abstract

We rigorously discuss the commonly asserted failures of the AIXI reinforcement learning agent as a model of embedded agency. We attempt to formalize these failure modes and prove that they occur within the framework of universal artificial intelligence, focusing on a variant of AIXI that models the joint action/percept history as drawn from the universal distribution. We also evaluate the progress that has been made towards a successful theory of embedded agency based on variants of the AIXI agent.

@article{wyeth2025_2505.17882,
  title={Formalizing Embeddedness Failures in Universal Artificial Intelligence},
  author={Cole Wyeth and Marcus Hutter},
  journal={arXiv preprint arXiv:2505.17882},
  year={2025}
}
