1606.02647
v1
v2 (latest)
Safe and Efficient Off-Policy Reinforcement Learning
8 June 2016
Rémi Munos
T. Stepleton
Anna Harutyunyan
Marc G. Bellemare
OffRL
Papers citing
"Safe and Efficient Off-Policy Reinforcement Learning"
24 / 374 papers shown
Title
Learning with Options that Terminate Off-Policy
Anna Harutyunyan
Peter Vrancx
Pierre-Luc Bacon
Doina Precup
A. Nowé
OffRL
127
28
0
10 Nov 2017
A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning
Marc Lanctot
V. Zambaldi
A. Gruslys
Angeliki Lazaridou
K. Tuyls
Julien Perolat
David Silver
T. Graepel
147
639
0
02 Nov 2017
On- and Off-Policy Monotonic Policy Improvement
R. Iwaki
Minoru Asada
OffRL
29
0
0
10 Oct 2017
Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents
Marlos C. Machado
Marc G. Bellemare
Erik Talvitie
J. Veness
Matthew J. Hausknecht
Michael Bowling
114
558
0
18 Sep 2017
A Brief Survey of Deep Reinforcement Learning
Kai Arulkumaran
M. Deisenroth
Miles Brundage
Anil Anthony Bharath
OffRL
143
2,830
0
19 Aug 2017
Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning
S. Gu
Timothy Lillicrap
Zoubin Ghahramani
Richard Turner
Bernhard Schölkopf
Sergey Levine
OffRL
96
164
0
01 Jun 2017
Convergent Tree Backup and Retrace with Function Approximation
Ahmed Touati
Pierre-Luc Bacon
Doina Precup
Pascal Vincent
106
40
0
25 May 2017
Guide Actor-Critic for Continuous Control
Voot Tangkaratt
A. Abdolmaleki
Masashi Sugiyama
67
17
0
22 May 2017
Discrete Sequential Prediction of Continuous Actions for Deep RL
Luke Metz
Julian Ibarz
Navdeep Jaitly
James Davidson
BDL
OffRL
92
120
0
14 May 2017
Investigating Recurrence and Eligibility Traces in Deep Q-Networks
J. Harb
Doina Precup
54
21
0
18 Apr 2017
The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning
A. Gruslys
Will Dabney
M. G. Azar
Bilal Piot
Marc G. Bellemare
Rémi Munos
76
58
0
15 Apr 2017
On Generalized Bellman Equations and Temporal-Difference Learning
Huizhen Yu
A. R. Mahmood
R. Sutton
118
29
0
14 Apr 2017
Deep Exploration via Randomized Value Functions
Ian Osband
Benjamin Van Roy
Daniel Russo
Zheng Wen
116
307
0
22 Mar 2017
Using Options and Covariance Testing for Long Horizon Off-Policy Policy Evaluation
Z. Guo
Philip S. Thomas
Emma Brunskill
OffRL
128
2
0
09 Mar 2017
Neural Episodic Control
Alexander Pritzel
Benigno Uria
Sriram Srinivasan
A. Badia
Oriol Vinyals
Demis Hassabis
Daan Wierstra
Charles Blundell
OffRL
BDL
113
346
0
06 Mar 2017
Count-Based Exploration with Neural Density Models
Georg Ostrovski
Marc G. Bellemare
Aaron van den Oord
Rémi Munos
104
626
0
03 Mar 2017
Bridging the Gap Between Value and Policy Based Reinforcement Learning
Ofir Nachum
Mohammad Norouzi
Kelvin Xu
Dale Schuurmans
203
478
0
28 Feb 2017
Reinforcement Learning Algorithm Selection
Romain Laroche
Raphael Feraud
OffRL
74
8
0
30 Jan 2017
Deep Reinforcement Learning: An Overview
Yuxi Li
OffRL
VLM
346
1,549
0
25 Jan 2017
Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic
S. Gu
Timothy Lillicrap
Zoubin Ghahramani
Richard Turner
Sergey Levine
OffRL
BDL
106
345
0
07 Nov 2016
Learning to Play in a Day: Faster Deep Reinforcement Learning by Optimality Tightening
Frank S. He
Yang Liu
Alex Schwing
Jian-wei Peng
91
84
0
05 Nov 2016
Sample Efficient Actor-Critic with Experience Replay
Ziyun Wang
V. Bapst
N. Heess
Volodymyr Mnih
Rémi Munos
Koray Kavukcuoglu
Nando de Freitas
129
762
0
03 Nov 2016
Unifying Count-Based Exploration and Intrinsic Motivation
Marc G. Bellemare
S. Srinivasan
Georg Ostrovski
Tom Schaul
D. Saxton
Rémi Munos
195
1,485
0
06 Jun 2016
Q(λ) with Off-Policy Corrections
Anna Harutyunyan
Marc G. Bellemare
T. Stepleton
Rémi Munos
OffRL
99
96
0
16 Feb 2016