Off-Switching Not Guaranteed

13 February 2025
Sven Neth
Abstract

Hadfield-Menell et al. (2017) propose the Off-Switch Game, a model of Human-AI cooperation in which AI agents always defer to humans because they are uncertain about our preferences. I explain two reasons why AI agents might not defer. First, AI agents might not value learning. Second, even if AI agents value learning, they might not be certain to learn our actual preferences.
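
As a rough illustration of the deference result that the abstract questions, the following sketch compares the AI agent's three options in a simplified Off-Switch Game: act directly, switch itself off, or defer to the human. It rests on assumptions that are not stated in the abstract: the agent's uncertainty is represented by samples of the action's utility, switching off is worth 0, and the human knows the true utility and permits the action only when it is positive. The function and variable names are hypothetical. Under these idealized assumptions deferring is never worse than the alternatives, since E[max(U, 0)] >= max(E[U], 0).

```python
import random


def expected_values(utility_samples):
    """Expected payoff of each option in a simplified Off-Switch Game.

    utility_samples: draws from the agent's belief about the utility U
    of its proposed action (an assumed representation of its uncertainty).
    """
    n = len(utility_samples)
    # Act without asking: the agent receives U whatever its sign.
    act = sum(utility_samples) / n
    # Switch itself off: payoff fixed at 0 by assumption.
    switch_off = 0.0
    # Defer: an idealized human allows the action only when U > 0,
    # so the agent receives max(U, 0).
    defer = sum(max(u, 0.0) for u in utility_samples) / n
    return {"act": act, "switch_off": switch_off, "defer": defer}


random.seed(0)
samples = [random.gauss(0.0, 1.0) for _ in range(10_000)]
values = expected_values(samples)
best_option = max(values, key=values.get)
```

The guarantee depends entirely on the idealized assumptions built into the `defer` line. The abstract's two objections, that the agent might not value learning and that it might not be certain to learn our actual preferences, target exactly those assumptions.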

@article{neth2025_2502.08864,
  title={Off-Switching Not Guaranteed},
  author={Sven Neth},
  journal={arXiv preprint arXiv:2502.08864},
  year={2025}
}