An Emphatic Approach to the Problem of Off-policy Temporal-Difference Learning

14 March 2015

Papers citing "An Emphatic Approach to the Problem of Off-policy Temporal-Difference Learning"


Divergence-Augmented Policy Optimization Qing Wang Yingru Li Jiechao Xiong Tong Zhang OffRL 123 16 0 28 Jan 2025
MAD-TD: Model-Augmented Data stabilizes High Update Ratio RL C. Voelcker Marcel Hussing Eric Eaton Amir-massoud Farahmand Igor Gilitschenski 72 4 0 11 Oct 2024
Efficient Policy Evaluation with Safety Constraint for Reinforcement Learning Claire Chen Shuze Liu Shangtong Zhang OffRL 287 1 0 08 Oct 2024
Doubly Optimal Policy Evaluation for Reinforcement Learning Shuze Liu Claire Chen Shangtong Zhang OffRL 121 3 0 03 Oct 2024
DeepAveragers: Offline Reinforcement Learning by Solving Derived Non-Parametric MDPs Aayam Shrestha Stefan Lee Prasad Tadepalli Alan Fern OffRL 80 23 0 18 Oct 2020
Consistent On-Line Off-Policy Evaluation Assaf Hallak Shie Mannor OffRL 49 93 0 23 Feb 2017
Weak Convergence Properties of Constrained Emphatic Temporal-difference Learning with Constant and Slowly Diminishing Stepsize Huizhen Yu 39 30 0 23 Nov 2015
Generalized Emphatic Temporal Difference Learning: Bias-Variance Analysis Assaf Hallak Aviv Tamar Rémi Munos Shie Mannor OffRL 70 56 0 17 Sep 2015
True Online Emphatic TD(λ): Quick Reference and Implementation Guide R. Sutton OffRL 23 1 0 25 Jul 2015
Emphatic Temporal-Difference Learning A. R. Mahmood Huizhen Yu Martha White R. Sutton 90 33 0 06 Jul 2015
On Convergence of Emphatic Temporal-Difference Learning Huizhen Yu OffRL 36 73 0 08 Jun 2015
Off-policy Learning with Eligibility Traces: A Survey Matthieu Geist B. Scherrer OffRL 48 94 0 15 Apr 2013
Multi-timescale Nexting in a Reinforcement Learning Robot Joseph Modayil Adam White R. Sutton 157 130 0 06 Dec 2011