Weak Convergence Properties of Constrained Emphatic Temporal-difference Learning with Constant and Slowly Diminishing Stepsize

23 November 2015

Huizhen Yu

Papers citing "Weak Convergence Properties of Constrained Emphatic Temporal-difference Learning with Constant and Slowly Diminishing Stepsize"


Title
Estimating Optimal Infinite Horizon Dynamic Treatment Regimes via pT-Learning; Wenzhuo Zhou, Ruoqing Zhu, Annie Qu; 37 22 0; 20 Oct 2021
Multi-step Off-policy Learning Without Importance Sampling Ratios; A. R. Mahmood, Huizhen Yu, R. Sutton; OffRL; 16 54 0; 09 Feb 2017