This paper investigates MDPs with intermittent state information. We consider a scenario where the controller perceives the state information of the process via an unreliable communication channel, and the transmissions of state information over the whole time horizon are modeled as a Bernoulli lossy process. The problem is therefore to find an optimal action-selection policy in the presence of state-information losses. We first formulate the problem as a belief MDP to establish structural results, and systematically study the effect of state-information losses on the expected total discounted reward. We then reformulate the problem as a tree MDP, whose state space is organized in a tree structure, and develop two finite-state approximations to the tree MDP to find near-optimal policies efficiently. Finally, we put forth a nested value iteration algorithm for the finite-state approximations, which we prove to be faster than standard value iteration. Numerical results demonstrate the effectiveness of our methods.
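To make the belief-MDP idea concrete, below is a minimal Python sketch of a controller whose state reports arrive through a Bernoulli erasure channel, as described in the abstract. It is not the paper's nested value iteration algorithm or its tree-MDP construction; all names and parameters (the kernel `P`, reward `R`, discount `gamma`, delivery probability `p_obs`, and the helpers `belief_update` and `greedy_action`) are illustrative assumptions. The sketch runs standard value iteration on the underlying MDP and reuses that value function as a heuristic for one-step lookahead on the belief.

```python
# Illustrative sketch only (not the paper's algorithm): belief propagation
# under a Bernoulli lossy observation channel, plus a heuristic controller.
import numpy as np

rng = np.random.default_rng(0)

n_states, n_actions = 4, 2
gamma = 0.9   # discount factor (assumed)
p_obs = 0.7   # Bernoulli probability that a state report arrives (assumed)

# Random transition kernel P[a, s, s'] and reward R[s, a] for illustration.
P = rng.random((n_actions, n_states, n_states))
P /= P.sum(axis=2, keepdims=True)
R = rng.random((n_states, n_actions))

def belief_update(b, a, obs):
    """Update the belief after taking action a.
    If the report arrives (obs is a state index), the belief collapses onto
    the observed state; if it is lost (obs is None), the belief is pushed
    through the transition kernel instead."""
    if obs is not None:
        b_next = np.zeros(n_states)
        b_next[obs] = 1.0
        return b_next
    return b @ P[a]

def greedy_action(b, V):
    """One-step lookahead on the belief: pick the action maximizing expected
    immediate reward plus the discounted value of the predicted belief."""
    q = [b @ R[:, a] + gamma * (b @ P[a]) @ V for a in range(n_actions)]
    return int(np.argmax(q))

# Standard value iteration on the fully observed MDP; V serves above as a
# heuristic terminal value for the belief lookahead.
V = np.zeros(n_states)
for _ in range(500):
    Q = R + gamma * np.einsum("ast,t->sa", P, V)
    V_new = Q.max(axis=1)
    if np.abs(V_new - V).max() < 1e-8:
        break
    V = V_new

# Simulate a few steps: the true state evolves, but the controller sees it
# only when the Bernoulli channel delivers the report.
s = 0
b = np.zeros(n_states)
b[s] = 1.0
for t in range(5):
    a = greedy_action(b, V)
    s = rng.choice(n_states, p=P[a, s])
    obs = s if rng.random() < p_obs else None
    b = belief_update(b, a, obs)
    print(f"t={t} action={a} observed={obs} belief={np.round(b, 2)}")
```

Note that when a report is lost, the prediction step `b @ P[a]` is the exact Bayesian update here, since a Bernoulli erasure that is independent of the state carries no information about it.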
@article{chen2025_2302.11761,
  title   = {Intermittently Observable Markov Decision Processes},
  author  = {Gongpu Chen and Soung-Chang Liew},
  journal = {arXiv preprint arXiv:2302.11761},
  year    = {2025}
}