1611.02247
Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic
7 November 2016
S. Gu
Timothy Lillicrap
Zoubin Ghahramani
Richard Turner
Sergey Levine
OffRL
BDL
Papers citing
"Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic"
46 / 196 papers shown
Title
Deep Reinforcement Learning for Vision-Based Robotic Grasping: A Simulated Comparative Evaluation of Off-Policy Methods
Deirdre Quillen
Eric Jang
Ofir Nachum
Chelsea Finn
Julian Ibarz
Sergey Levine
OOD
OffRL
23
202
0
28 Feb 2018
The Mirage of Action-Dependent Baselines in Reinforcement Learning
George Tucker
Surya Bhupatiraju
S. Gu
Richard Turner
Zoubin Ghahramani
Sergey Levine
OffRL
30
126
0
27 Feb 2018
Fully Decentralized Multi-Agent Reinforcement Learning with Networked Agents
Kaipeng Zhang
Zhuoran Yang
Han Liu
Tong Zhang
Tamer Basar
43
582
0
23 Feb 2018
Convergent Actor-Critic Algorithms Under Off-Policy Training and Function Approximation
H. Maei
OffRL
8
32
0
21 Feb 2018
Clipped Action Policy Gradient
Yasuhiro Fujita
S. Maeda
OffRL
34
37
0
21 Feb 2018
Fourier Policy Gradients
M. Fellows
K. Ciosek
Shimon Whiteson
35
15
0
19 Feb 2018
Accelerated Primal-Dual Policy Optimization for Safe Reinforcement Learning
Qingkai Liang
Fanyu Que
E. Modiano
13
101
0
19 Feb 2018
Reinforcement Learning from Imperfect Demonstrations
Yang Gao
Huazhe Xu
Ji Lin
Feng Yu
Sergey Levine
Trevor Darrell
18
199
0
14 Feb 2018
GEP-PG: Decoupling Exploration and Exploitation in Deep Reinforcement Learning Algorithms
Cédric Colas
Olivier Sigaud
Pierre-Yves Oudeyer
29
157
0
14 Feb 2018
M-Walk: Learning to Walk over Graphs using Monte Carlo Tree Search
Yelong Shen
Jianshu Chen
Po-Sen Huang
Yuqing Guo
Jianfeng Gao
29
127
0
12 Feb 2018
Pretraining Deep Actor-Critic Reinforcement Learning Algorithms With Expert Demonstrations
Xiaoqin Zhang
Huimin Ma
OffRL
29
38
0
31 Jan 2018
Experience-driven Networking: A Deep Reinforcement Learning based Approach
Zhiyuan Xu
Jian Tang
Jingsong Meng
Weiyi Zhang
Yanzhi Wang
C. Liu
Dejun Yang
OffRL
27
359
0
17 Jan 2018
Expected Policy Gradients for Reinforcement Learning
K. Ciosek
Shimon Whiteson
50
51
0
10 Jan 2018
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine
23
8,158
0
04 Jan 2018
RLlib: Abstractions for Distributed Reinforcement Learning
Eric Liang
Richard Liaw
Philipp Moritz
Robert Nishihara
Roy Fox
Ken Goldberg
Joseph E. Gonzalez
Michael I. Jordan
Ion Stoica
OffRL
AI4CE
31
173
0
26 Dec 2017
Least-Squares Temporal Difference Learning for the Linear Quadratic Regulator
Stephen Tu
Benjamin Recht
OffRL
32
130
0
22 Dec 2017
Action Branching Architectures for Deep Reinforcement Learning
Arash Tavakoli
Fabio Pardo
Petar Kormushev
22
258
0
24 Nov 2017
How Generative Adversarial Networks and Their Variants Work: An Overview
Yongjun Hong
Uiwon Hwang
Jaeyoon Yoo
Sungroh Yoon
GAN
41
153
0
16 Nov 2017
Costate-focused models for reinforcement learning
B. Behrouzi
Xuefei Liu
D. Tweed
OffRL
18
0
0
15 Nov 2017
Backpropagation through the Void: Optimizing control variates for black-box gradient estimation
Will Grathwohl
Dami Choi
Yuhuai Wu
Geoffrey Roeder
David Duvenaud
56
300
0
31 Oct 2017
Action-dependent Control Variates for Policy Optimization via Stein's Identity
Hao Liu
Yihao Feng
Yi Mao
Dengyong Zhou
Jian-wei Peng
Qiang Liu
26
4
0
30 Oct 2017
On- and Off-Policy Monotonic Policy Improvement
R. Iwaki
Minoru Asada
OffRL
20
0
0
10 Oct 2017
Local Communication Protocols for Learning Complex Swarm Behaviors with Deep Reinforcement Learning
Maximilian Hüttenrauch
Adrian Šošić
Gerhard Neumann
6
3
0
21 Sep 2017
Deep Reinforcement Learning that Matters
Peter Henderson
Riashat Islam
Philip Bachman
Joelle Pineau
Doina Precup
D. Meger
OffRL
45
1,932
0
19 Sep 2017
Mean Actor Critic
Cameron Allen
Kavosh Asadi
Melrose Roderick
Abdel-rahman Mohamed
George Konidaris
Michael Littman
20
44
0
01 Sep 2017
A Brief Survey of Deep Reinforcement Learning
Kai Arulkumaran
M. Deisenroth
Miles Brundage
Anil Anthony Bharath
OffRL
59
2,776
0
19 Aug 2017
Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation
Yuhuai Wu
Elman Mansimov
Shun Liao
Roger C. Grosse
Jimmy Ba
OffRL
22
622
0
17 Aug 2017
Benchmark Environments for Multitask Learning in Continuous Domains
Peter Henderson
Wei-Di Chang
Florian Shkurti
Johanna Hansen
D. Meger
Gregory Dudek
6
40
0
14 Aug 2017
Reproducibility of Benchmarked Deep Reinforcement Learning Tasks for Continuous Control
Riashat Islam
Peter Henderson
Maziar Gomrokchi
Doina Precup
BDL
OffRL
8
251
0
10 Aug 2017
Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning
Anusha Nagabandi
G. Kahn
R. Fearing
Sergey Levine
9
965
0
08 Aug 2017
Deep Reinforcement Learning Attention Selection for Person Re-Identification
Xu Lan
Hanxiao Wang
S. Gong
Xiatian Zhu
18
6
0
10 Jul 2017
Trust-PCL: An Off-Policy Trust Region Method for Continuous Control
Ofir Nachum
Mohammad Norouzi
Kelvin Xu
Dale Schuurmans
19
106
0
06 Jul 2017
Expected Policy Gradients
K. Ciosek
Shimon Whiteson
17
57
0
15 Jun 2017
Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning
S. Gu
Timothy Lillicrap
Zoubin Ghahramani
Richard Turner
Bernhard Schölkopf
Sergey Levine
OffRL
21
164
0
01 Jun 2017
Constrained Policy Optimization
Joshua Achiam
David Held
Aviv Tamar
Pieter Abbeel
9
1,301
0
30 May 2017
Learning End-to-end Multimodal Sensor Policies for Autonomous Navigation
Guan-Horng Liu
Avinash Siravuru
Sai P. Selvaraj
Manuela Veloso
George Kantor
8
69
0
30 May 2017
Enhanced Experience Replay Generation for Efficient Reinforcement Learning
Vincent Huang
Tobias Ley
Martha Vlachou-Konchylaki
Wenfeng Hu
OnRL
GAN
SyDa
13
9
0
23 May 2017
Stein Variational Policy Gradient
Yang Liu
Prajit Ramachandran
Qiang Liu
Jian-wei Peng
14
138
0
07 Apr 2017
Learning Combinatorial Optimization Algorithms over Graphs
H. Dai
Elias Boutros Khalil
Yuyu Zhang
B. Dilkina
Le Song
26
1,444
0
05 Apr 2017
Learning to Navigate Cloth using Haptics
Alexander Clegg
Wenhao Yu
Zackory M. Erickson
Jie Tan
Chenxi Liu
Greg Turk
18
23
0
20 Mar 2017
Towards Generalization and Simplicity in Continuous Control
Aravind Rajeswaran
Kendall Lowrey
E. Todorov
Sham Kakade
OffRL
26
276
0
08 Mar 2017
Bridging the Gap Between Value and Policy Based Reinforcement Learning
Ofir Nachum
Mohammad Norouzi
Kelvin Xu
Dale Schuurmans
15
465
0
28 Feb 2017
Reinforcement Learning with Deep Energy-Based Policies
Tuomas Haarnoja
Haoran Tang
Pieter Abbeel
Sergey Levine
26
1,314
0
27 Feb 2017
Preparing for the Unknown: Learning a Universal Policy with Online System Identification
Wenhao Yu
Jie Tan
Chenxi Liu
Greg Turk
OffRL
15
305
0
08 Feb 2017
Deep Reinforcement Learning: An Overview
Yuxi Li
OffRL
VLM
104
1,503
0
25 Jan 2017
Input Convex Neural Networks
Brandon Amos
Lei Xu
J. Zico Kolter
187
603
0
22 Sep 2016