1611.01626
Cited By
Combining policy gradient and Q-learning
5 November 2016
Brendan O'Donoghue
Rémi Munos
Koray Kavukcuoglu
Volodymyr Mnih
OffRL
OnRL
Papers citing
"Combining policy gradient and Q-learning"
15 / 15 papers shown
Title
Bridging the Gap Between Value and Policy Based Reinforcement Learning
Ofir Nachum
Mohammad Norouzi
Kelvin Xu
Dale Schuurmans
156
471
0
28 Feb 2017
Reward Augmented Maximum Likelihood for Neural Structured Prediction
Mohammad Norouzi
Samy Bengio
Zhifeng Chen
Navdeep Jaitly
M. Schuster
Yonghui Wu
Dale Schuurmans
77
253
0
01 Sep 2016
Asynchronous Methods for Deep Reinforcement Learning
Volodymyr Mnih
Adrià Puigdomènech Badia
Mehdi Mirza
Alex Graves
Timothy Lillicrap
Tim Harley
David Silver
Koray Kavukcuoglu
191
8,850
0
04 Feb 2016
Taming the Noise in Reinforcement Learning via Soft Updates
Roy Fox
Ari Pakman
Naftali Tishby
70
338
0
28 Dec 2015
Policy Gradient Methods for Off-policy Control
Lucas Lehnert
Doina Precup
OffRL
33
4
0
13 Dec 2015
Dueling Network Architectures for Deep Reinforcement Learning
Ziyu Wang
Tom Schaul
Matteo Hessel
H. V. Hasselt
Marc Lanctot
Nando de Freitas
OffRL
91
3,755
0
20 Nov 2015
Prioritized Experience Replay
Tom Schaul
John Quan
Ioannis Antonoglou
David Silver
OffRL
214
3,787
0
18 Nov 2015
Deep Reinforcement Learning with Double Q-learning
H. V. Hasselt
A. Guez
David Silver
OffRL
161
7,635
0
22 Sep 2015
Continuous control with deep reinforcement learning
Timothy Lillicrap
Jonathan J. Hunt
Alexander Pritzel
N. Heess
Tom Erez
Yuval Tassa
David Silver
Daan Wierstra
318
13,234
0
09 Sep 2015
End-to-End Training of Deep Visuomotor Policies
Sergey Levine
Chelsea Finn
Trevor Darrell
Pieter Abbeel
BDL
308
3,434
0
02 Apr 2015
Trust Region Policy Optimization
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
277
6,764
0
19 Feb 2015
Playing Atari with Deep Reinforcement Learning
Volodymyr Mnih
Koray Kavukcuoglu
David Silver
Alex Graves
Ioannis Antonoglou
Daan Wierstra
Martin Riedmiller
123
12,227
0
19 Dec 2013
Revisiting Natural Gradient for Deep Networks
Razvan Pascanu
Yoshua Bengio
ODL
134
389
0
16 Jan 2013
The Arcade Learning Environment: An Evaluation Platform for General Agents
Marc G. Bellemare
Yavar Naddaf
J. Veness
Michael Bowling
109
3,004
0
19 Jul 2012
Dynamic Policy Programming
M. G. Azar
Vicenç Gómez
H. Kappen
109
123
0
12 Apr 2010