'Indifference' methods for managing agent rewards

'Indifference' refers to a class of methods used to control a reward-based agent. These methods of control work even if the implications of the agent's reward are otherwise not fully understood. Though they all come out of similar ideas, indifference techniques can be classified as ways of achieving one or more of three distinct goals: rewards dependent on certain events (with no motivation for the agent to manipulate the probability of those events), effective disbelief that an event will ever occur, and seamless transition from one behaviour to another. This paper analyses methods of achieving these goals in the POMDP setting, and establishes their uses, strengths, and limitations. It aims to make the tools of indifference generally accessible and usable to agent designers.