ResearchTrend.AI

'Indifference' methods for managing agent rewards
arXiv:1712.06365 (v4, latest)

18 December 2017
Stuart Armstrong
Xavier O'Rourke

Papers citing "'Indifference' methods for managing agent rewards"

13 / 13 papers shown
Towards shutdownable agents via stochastic choice
Elliott Thornley
Alexander Roman
Christos Ziakas
Leyton Ho
Louis Thomson
120
0
0
30 Jun 2024
AGI Safety Literature Review
Tom Everitt
G. Lea
Marcus Hutter
AI4CE
76
116
0
03 May 2018
Counterfactual equivalence for POMDPs, and underlying deterministic environments
Stuart Armstrong
47
2
0
11 Jan 2018
AI Safety Gridworlds
Jan Leike
Miljan Martic
Victoria Krakovna
Pedro A. Ortega
Tom Everitt
Andrew Lefrancq
Laurent Orseau
Shane Legg
138
255
0
27 Nov 2017
Good and safe uses of AI Oracles
Stuart Armstrong
Xavier O'Rourke
138
27
0
15 Nov 2017
Inverse Reward Design
Dylan Hadfield-Menell
S. Milli
Pieter Abbeel
Stuart J. Russell
Anca Dragan
93
400
0
08 Nov 2017
Deep reinforcement learning from human preferences
Paul Christiano
Jan Leike
Tom B. Brown
Miljan Martic
Shane Legg
Dario Amodei
220
3,380
0
12 Jun 2017
Should Robots be Obedient?
S. Milli
Dylan Hadfield-Menell
Anca Dragan
Stuart J. Russell
69
59
0
28 May 2017
Enter the Matrix: Safely Interruptible Autonomous Systems via Virtualization
Mark O. Riedl
Brent Harrison
47
7
0
30 Mar 2017
Concrete Problems in AI Safety
Dario Amodei
C. Olah
Jacob Steinhardt
Paul Christiano
John Schulman
Dandelion Mané
260
2,405
0
21 Jun 2016
Self-Modification of Policy and Utility Function in Rational Agents
Tom Everitt
Daniel Filan
Mayank Daswani
Marcus Hutter
59
29
0
10 May 2016
Learning the Preferences of Ignorant, Inconsistent Agents
Owain Evans
Andreas Stuhlmuller
Noah D. Goodman
78
128
0
18 Dec 2015
Can Intelligence Explode?
Marcus Hutter
ELM
66
29
0
28 Feb 2012