An Empirical Comparison of Off-policy Prediction Learning Algorithms on the Collision Task

2 June 2021

Papers citing "An Empirical Comparison of Off-policy Prediction Learning Algorithms on the Collision Task"

Title
Gradient Temporal-Difference Learning with Regularized Corrections. Sina Ghiassian, Andrew Patterson, Shivam Garg, Dhawal Gupta, Adam White, Martha White. 116 42 0. 01 Jul 2020
Finite-Sample Analysis of Proximal Gradient TD Algorithms. Bo Liu, Ji Liu, Mohammad Ghavamzadeh, Sridhar Mahadevan, Marek Petrik. 50 158 0. 06 Jun 2020
IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures. L. Espeholt, Hubert Soyer, Rémi Munos, Karen Simonyan, Volodymyr Mnih, ..., Vlad Firoiu, Tim Harley, Iain Dunning, Shane Legg, Koray Kavukcuoglu. 191 1,594 0. 05 Feb 2018
Convergent Tree Backup and Retrace with Function Approximation. Ahmed Touati, Pierre-Luc Bacon, Doina Precup, Pascal Vincent. 83 40 0. 25 May 2017
A First Empirical Study of Emphatic Temporal Difference Learning. Sina Ghiassian, Banafsheh Rafiee, R. Sutton. OffRL. 35 14 0. 11 May 2017
On Generalized Bellman Equations and Temporal-Difference Learning. Huizhen Yu, A. R. Mahmood, R. Sutton. 109 29 0. 14 Apr 2017
Multi-step Off-policy Learning Without Importance Sampling Ratios. A. R. Mahmood, Huizhen Yu, R. Sutton. OffRL. 108 54 0. 09 Feb 2017
Reinforcement Learning with Unsupervised Auxiliary Tasks. Max Jaderberg, Volodymyr Mnih, Wojciech M. Czarnecki, Tom Schaul, Joel Z Leibo, David Silver, Koray Kavukcuoglu. SSL. 90 1,228 0. 16 Nov 2016
Unifying task specification in reinforcement learning. Martha White. OffRL. 44 89 0. 07 Sep 2016
Safe and Efficient Off-Policy Reinforcement Learning. Rémi Munos, T. Stepleton, Anna Harutyunyan, Marc G. Bellemare. OffRL. 138 615 0. 08 Jun 2016
Investigating practical linear temporal difference learning. Adam White, Martha White. OffRL. 50 41 0. 28 Feb 2016
Generalized Emphatic Temporal Difference Learning: Bias-Variance Analysis. Assaf Hallak, Aviv Tamar, Rémi Munos, Shie Mannor. OffRL. 91 56 0. 17 Sep 2015
An Emphatic Approach to the Problem of Off-policy Temporal-Difference Learning. R. Sutton, A. R. Mahmood, Martha White. 82 269 0. 14 Mar 2015
Proximal Reinforcement Learning: A New Theory of Sequential Decision Making in Primal-Dual Spaces. Sridhar Mahadevan, Bo Liu, Philip S. Thomas, Will Dabney, S. Giguere, Nicholas Jacek, I. Gemp, Ji Liu. 41 67 0. 26 May 2014
Off-policy Learning with Eligibility Traces: A Survey. Matthieu Geist, B. Scherrer. OffRL. 83 94 0. 15 Apr 2013
Solving variational inequalities with Stochastic Mirror-Prox algorithm. A. Juditsky, A. Nemirovskii, Claire Tauvel. 125 442 0. 04 Sep 2008