arXiv:2106.00922
An Empirical Comparison of Off-policy Prediction Learning Algorithms on the Collision Task
2 June 2021
Sina Ghiassian
R. Sutton
AAML
OffRL
Papers citing
"An Empirical Comparison of Off-policy Prediction Learning Algorithms on the Collision Task"
16 / 16 papers shown
Title
Gradient Temporal-Difference Learning with Regularized Corrections
Sina Ghiassian
Andrew Patterson
Shivam Garg
Dhawal Gupta
Adam White
Martha White
116
42
0
01 Jul 2020
Finite-Sample Analysis of Proximal Gradient TD Algorithms
Bo Liu
Ji Liu
Mohammad Ghavamzadeh
Sridhar Mahadevan
Marek Petrik
50
158
0
06 Jun 2020
IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
L. Espeholt
Hubert Soyer
Rémi Munos
Karen Simonyan
Volodymyr Mnih
...
Vlad Firoiu
Tim Harley
Iain Dunning
Shane Legg
Koray Kavukcuoglu
191
1,594
0
05 Feb 2018
Convergent Tree Backup and Retrace with Function Approximation
Ahmed Touati
Pierre-Luc Bacon
Doina Precup
Pascal Vincent
83
40
0
25 May 2017
A First Empirical Study of Emphatic Temporal Difference Learning
Sina Ghiassian
Banafsheh Rafiee
R. Sutton
OffRL
35
14
0
11 May 2017
On Generalized Bellman Equations and Temporal-Difference Learning
Huizhen Yu
A. R. Mahmood
R. Sutton
109
29
0
14 Apr 2017
Multi-step Off-policy Learning Without Importance Sampling Ratios
A. R. Mahmood
Huizhen Yu
R. Sutton
OffRL
108
54
0
09 Feb 2017
Reinforcement Learning with Unsupervised Auxiliary Tasks
Max Jaderberg
Volodymyr Mnih
Wojciech M. Czarnecki
Tom Schaul
Joel Z Leibo
David Silver
Koray Kavukcuoglu
SSL
90
1,228
0
16 Nov 2016
Unifying task specification in reinforcement learning
Martha White
OffRL
44
89
0
07 Sep 2016
Safe and Efficient Off-Policy Reinforcement Learning
Rémi Munos
T. Stepleton
Anna Harutyunyan
Marc G. Bellemare
OffRL
138
615
0
08 Jun 2016
Investigating practical linear temporal difference learning
Adam White
Martha White
OffRL
50
41
0
28 Feb 2016
Generalized Emphatic Temporal Difference Learning: Bias-Variance Analysis
Assaf Hallak
Aviv Tamar
Rémi Munos
Shie Mannor
OffRL
91
56
0
17 Sep 2015
An Emphatic Approach to the Problem of Off-policy Temporal-Difference Learning
R. Sutton
A. R. Mahmood
Martha White
82
269
0
14 Mar 2015
Proximal Reinforcement Learning: A New Theory of Sequential Decision Making in Primal-Dual Spaces
Sridhar Mahadevan
Bo Liu
Philip S. Thomas
Will Dabney
S. Giguere
Nicholas Jacek
I. Gemp
Ji Liu
41
67
0
26 May 2014
Off-policy Learning with Eligibility Traces: A Survey
Matthieu Geist
B. Scherrer
OffRL
83
94
0
15 Apr 2013
Solving variational inequalities with Stochastic Mirror-Prox algorithm
A. Juditsky
A. Nemirovskii
Claire Tauvel
125
442
0
04 Sep 2008