Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1912.10891
Cited By
Soft Q Network
20 December 2019
Jingbin Liu
Shuai Liu
Xinyang Gu
OffRL
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Soft Q Network"
19 / 19 papers shown
Title
Soft Actor-Critic Algorithms and Applications
Tuomas Haarnoja
Aurick Zhou
Kristian Hartikainen
George Tucker
Sehoon Ha
...
Vikash Kumar
Henry Zhu
Abhishek Gupta
Pieter Abbeel
Sergey Levine
116
2,418
0
13 Dec 2018
Learning Latent Dynamics for Planning from Pixels
Danijar Hafner
Timothy Lillicrap
Ian S. Fischer
Ruben Villegas
David R Ha
Honglak Lee
James Davidson
BDL
84
1,429
0
12 Nov 2018
Distributed Prioritized Experience Replay
Dan Horgan
John Quan
David Budden
Gabriel Barth-Maron
Matteo Hessel
H. V. Hasselt
David Silver
141
740
0
02 Mar 2018
Addressing Function Approximation Error in Actor-Critic Methods
Scott Fujimoto
H. V. Hoof
David Meger
OffRL
164
5,161
0
26 Feb 2018
IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
L. Espeholt
Hubert Soyer
Rémi Munos
Karen Simonyan
Volodymyr Mnih
...
Vlad Firoiu
Tim Harley
Iain Dunning
Shane Legg
Koray Kavukcuoglu
173
1,593
0
05 Feb 2018
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine
248
8,303
0
04 Jan 2018
Recurrent Deterministic Policy Gradient Method for Bipedal Locomotion on Rough Terrain Challenge
Doo Re Song
Chuanyu Yang
C. McGreavy
Zhibin Li
99
30
0
08 Oct 2017
Deep Reinforcement Learning that Matters
Peter Henderson
Riashat Islam
Philip Bachman
Joelle Pineau
Doina Precup
David Meger
OffRL
114
1,945
0
19 Sep 2017
A Distributional Perspective on Reinforcement Learning
Marc G. Bellemare
Will Dabney
Rémi Munos
OffRL
85
1,499
0
21 Jul 2017
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
314
18,906
0
20 Jul 2017
Noisy Networks for Exploration
Meire Fortunato
M. G. Azar
Bilal Piot
Jacob Menick
Ian Osband
...
Rémi Munos
Demis Hassabis
Olivier Pietquin
Charles Blundell
Shane Legg
79
893
0
30 Jun 2017
Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic
S. Gu
Timothy Lillicrap
Zoubin Ghahramani
Richard Turner
Sergey Levine
OffRL
BDL
88
345
0
07 Nov 2016
Asynchronous Methods for Deep Reinforcement Learning
Volodymyr Mnih
Adria Puigdomenech Badia
M. Berk Mirza
Alex Graves
Timothy Lillicrap
Tim Harley
David Silver
Koray Kavukcuoglu
173
8,832
0
04 Feb 2016
Dueling Network Architectures for Deep Reinforcement Learning
Ziyun Wang
Tom Schaul
Matteo Hessel
H. V. Hasselt
Marc Lanctot
Nando de Freitas
OffRL
79
3,749
0
20 Nov 2015
Prioritized Experience Replay
Tom Schaul
John Quan
Ioannis Antonoglou
David Silver
OffRL
210
3,786
0
18 Nov 2015
Deep Reinforcement Learning with Double Q-learning
H. V. Hasselt
A. Guez
David Silver
OffRL
146
7,621
0
22 Sep 2015
Continuous control with deep reinforcement learning
Timothy Lillicrap
Jonathan J. Hunt
Alexander Pritzel
N. Heess
Tom Erez
Yuval Tassa
David Silver
Daan Wierstra
278
13,208
0
09 Sep 2015
Trust Region Policy Optimization
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
257
6,749
0
19 Feb 2015
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
1.3K
149,820
0
22 Dec 2014
1