On the convergence of optimistic policy iteration for stochastic shortest path problem

27 August 2018

Abstract

In this paper, we prove convergence results for a special case of the optimistic policy iteration algorithm applied to the stochastic shortest path problem. We consider both Monte Carlo and $TD(\lambda)$ methods for the policy evaluation step, under the condition that the termination state is eventually reached almost surely.
