Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic

7 November 2016
S. Gu
Timothy Lillicrap
Zoubin Ghahramani
Richard Turner
Sergey Levine
    OffRL
    BDL
arXiv: 1611.02247

Papers citing "Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic"

46 / 196 papers shown
Deep Reinforcement Learning for Vision-Based Robotic Grasping: A Simulated Comparative Evaluation of Off-Policy Methods
Deirdre Quillen
Eric Jang
Ofir Nachum
Chelsea Finn
Julian Ibarz
Sergey Levine
OOD
OffRL
23
202
0
28 Feb 2018
The Mirage of Action-Dependent Baselines in Reinforcement Learning
George Tucker
Surya Bhupatiraju
S. Gu
Richard Turner
Zoubin Ghahramani
Sergey Levine
OffRL
30
126
0
27 Feb 2018
Fully Decentralized Multi-Agent Reinforcement Learning with Networked Agents
Kaipeng Zhang
Zhuoran Yang
Han Liu
Tong Zhang
Tamer Basar
43
582
0
23 Feb 2018
Convergent Actor-Critic Algorithms Under Off-Policy Training and Function Approximation
H. Maei
OffRL
8
32
0
21 Feb 2018
Clipped Action Policy Gradient
Yasuhiro Fujita
S. Maeda
OffRL
34
37
0
21 Feb 2018
Fourier Policy Gradients
M. Fellows
K. Ciosek
Shimon Whiteson
35
15
0
19 Feb 2018
Accelerated Primal-Dual Policy Optimization for Safe Reinforcement Learning
Qingkai Liang
Fanyu Que
E. Modiano
13
101
0
19 Feb 2018
Reinforcement Learning from Imperfect Demonstrations
Yang Gao
Huazhe Xu
Ji Lin
Feng Yu
Sergey Levine
Trevor Darrell
18
199
0
14 Feb 2018
GEP-PG: Decoupling Exploration and Exploitation in Deep Reinforcement Learning Algorithms
Cédric Colas
Olivier Sigaud
Pierre-Yves Oudeyer
29
157
0
14 Feb 2018
M-Walk: Learning to Walk over Graphs using Monte Carlo Tree Search
Yelong Shen
Jianshu Chen
Po-Sen Huang
Yuqing Guo
Jianfeng Gao
29
127
0
12 Feb 2018
Pretraining Deep Actor-Critic Reinforcement Learning Algorithms With Expert Demonstrations
Xiaoqin Zhang
Huimin Ma
OffRL
29
38
0
31 Jan 2018
Experience-driven Networking: A Deep Reinforcement Learning based Approach
Zhiyuan Xu
Jian Tang
Jingsong Meng
Weiyi Zhang
Yanzhi Wang
C. Liu
Dejun Yang
OffRL
27
359
0
17 Jan 2018
Expected Policy Gradients for Reinforcement Learning
K. Ciosek
Shimon Whiteson
50
51
0
10 Jan 2018
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine
23
8,158
0
04 Jan 2018
RLlib: Abstractions for Distributed Reinforcement Learning
Eric Liang
Richard Liaw
Philipp Moritz
Robert Nishihara
Roy Fox
Ken Goldberg
Joseph E. Gonzalez
Michael I. Jordan
Ion Stoica
OffRL
AI4CE
31
173
0
26 Dec 2017
Least-Squares Temporal Difference Learning for the Linear Quadratic Regulator
Stephen Tu
Benjamin Recht
OffRL
32
130
0
22 Dec 2017
Action Branching Architectures for Deep Reinforcement Learning
Arash Tavakoli
Fabio Pardo
Petar Kormushev
22
258
0
24 Nov 2017
How Generative Adversarial Networks and Their Variants Work: An Overview
Yongjun Hong
Uiwon Hwang
Jaeyoon Yoo
Sungroh Yoon
GAN
41
153
0
16 Nov 2017
Costate-focused models for reinforcement learning
B. Behrouzi
Xuefei Liu
D. Tweed
OffRL
18
0
0
15 Nov 2017
Backpropagation through the Void: Optimizing control variates for black-box gradient estimation
Will Grathwohl
Dami Choi
Yuhuai Wu
Geoffrey Roeder
David Duvenaud
56
300
0
31 Oct 2017
Action-depedent Control Variates for Policy Optimization via Stein's Identity
Hao Liu
Yihao Feng
Yi Mao
Dengyong Zhou
Jian-wei Peng
Qiang Liu
26
4
0
30 Oct 2017
On- and Off-Policy Monotonic Policy Improvement
R. Iwaki
Minoru Asada
OffRL
20
0
0
10 Oct 2017
Local Communication Protocols for Learning Complex Swarm Behaviors with Deep Reinforcement Learning
Maximilian Hüttenrauch
Adrian Šošić
Gerhard Neumann
6
3
0
21 Sep 2017
Deep Reinforcement Learning that Matters
Peter Henderson
Riashat Islam
Philip Bachman
Joelle Pineau
Doina Precup
D. Meger
OffRL
45
1,932
0
19 Sep 2017
Mean Actor Critic
Cameron Allen
Kavosh Asadi
Melrose Roderick
Abdel-rahman Mohamed
George Konidaris
Michael Littman
20
44
0
01 Sep 2017
A Brief Survey of Deep Reinforcement Learning
Kai Arulkumaran
M. Deisenroth
Miles Brundage
Anil Anthony Bharath
OffRL
59
2,776
0
19 Aug 2017
Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation
Yuhuai Wu
Elman Mansimov
Shun Liao
Roger C. Grosse
Jimmy Ba
OffRL
22
622
0
17 Aug 2017
Benchmark Environments for Multitask Learning in Continuous Domains
Peter Henderson
Wei-Di Chang
Florian Shkurti
Johanna Hansen
D. Meger
Gregory Dudek
6
40
0
14 Aug 2017
Reproducibility of Benchmarked Deep Reinforcement Learning Tasks for Continuous Control
Riashat Islam
Peter Henderson
Maziar Gomrokchi
Doina Precup
BDL
OffRL
8
251
0
10 Aug 2017
Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning
Anusha Nagabandi
G. Kahn
R. Fearing
Sergey Levine
9
965
0
08 Aug 2017
Deep Reinforcement Learning Attention Selection for Person Re-Identification
Xu Lan
Hanxiao Wang
S. Gong
Xiatian Zhu
18
6
0
10 Jul 2017
Trust-PCL: An Off-Policy Trust Region Method for Continuous Control
Ofir Nachum
Mohammad Norouzi
Kelvin Xu
Dale Schuurmans
19
106
0
06 Jul 2017
Expected Policy Gradients
K. Ciosek
Shimon Whiteson
17
57
0
15 Jun 2017
Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning
S. Gu
Timothy Lillicrap
Zoubin Ghahramani
Richard Turner
Bernhard Schölkopf
Sergey Levine
OffRL
21
164
0
01 Jun 2017
Constrained Policy Optimization
Joshua Achiam
David Held
Aviv Tamar
Pieter Abbeel
9
1,301
0
30 May 2017
Learning End-to-end Multimodal Sensor Policies for Autonomous Navigation
Guan-Horng Liu
Avinash Siravuru
Sai P. Selvaraj
Manuela Veloso
George Kantor
8
69
0
30 May 2017
Enhanced Experience Replay Generation for Efficient Reinforcement Learning
Vincent Huang
Tobias Ley
Martha Vlachou-Konchylaki
Wenfeng Hu
OnRL
GAN
SyDa
13
9
0
23 May 2017
Stein Variational Policy Gradient
Yang Liu
Prajit Ramachandran
Qiang Liu
Jian-wei Peng
14
138
0
07 Apr 2017
Learning Combinatorial Optimization Algorithms over Graphs
H. Dai
Elias Boutros Khalil
Yuyu Zhang
B. Dilkina
Le Song
26
1,444
0
05 Apr 2017
Learning to Navigate Cloth using Haptics
Alexander Clegg
Wenhao Yu
Zackory M. Erickson
Jie Tan
Chenxi Liu
Greg Turk
18
23
0
20 Mar 2017
Towards Generalization and Simplicity in Continuous Control
Aravind Rajeswaran
Kendall Lowrey
E. Todorov
Sham Kakade
OffRL
26
276
0
08 Mar 2017
Bridging the Gap Between Value and Policy Based Reinforcement Learning
Ofir Nachum
Mohammad Norouzi
Kelvin Xu
Dale Schuurmans
15
465
0
28 Feb 2017
Reinforcement Learning with Deep Energy-Based Policies
Tuomas Haarnoja
Haoran Tang
Pieter Abbeel
Sergey Levine
26
1,314
0
27 Feb 2017
Preparing for the Unknown: Learning a Universal Policy with Online System Identification
Wenhao Yu
Jie Tan
Chenxi Liu
Greg Turk
OffRL
15
305
0
08 Feb 2017
Deep Reinforcement Learning: An Overview
Yuxi Li
OffRL
VLM
104
1,503
0
25 Jan 2017
Input Convex Neural Networks
Brandon Amos
Lei Xu
J. Zico Kolter
187
603
0
22 Sep 2016