ResearchTrend.AI
Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic


7 November 2016
S. Gu
Timothy Lillicrap
Zoubin Ghahramani
Richard Turner
Sergey Levine
    OffRL
    BDL

Papers citing "Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic"

50 / 196 papers shown
DOB-Net: Actively Rejecting Unknown Excessive Time-Varying Disturbances
Tianming Wang
Wenjie Lu
Zheng Yan
Dikai Liu
49
4
0
10 Jul 2019
FiDi-RL: Incorporating Deep Reinforcement Learning with Finite-Difference Policy Search for Efficient Learning of Continuous Control
Longxiang Shi
Shijian Li
LongBing Cao
Long Yang
Gang Zheng
Gang Pan
14
5
0
01 Jul 2019
Policy Optimization with Stochastic Mirror Descent
Long Yang
Yu Zhang
Gang Zheng
Qian Zheng
Pengfei Li
Jianhang Huang
Jun Wen
Gang Pan
31
34
0
25 Jun 2019
Ranking Policy Gradient
Kaixiang Lin
Jiayu Zhou
OffRL
19
7
0
24 Jun 2019
An Improved Convergence Analysis of Stochastic Variance-Reduced Policy Gradient
Pan Xu
F. Gao
Quanquan Gu
8
93
0
29 May 2019
Smoothing Policies and Safe Policy Gradients
Matteo Papini
Matteo Pirotta
Marcello Restelli
26
29
0
08 May 2019
Dimension-Wise Importance Sampling Weight Clipping for Sample-Efficient Reinforcement Learning
Seungyul Han
Y. Sung
OffRL
6
20
0
07 May 2019
P3O: Policy-on Policy-off Policy Optimization
Rasool Fakoor
Pratik Chaudhari
Alex Smola
OffRL
29
51
0
05 May 2019
ARSM: Augment-REINFORCE-Swap-Merge Estimator for Gradient Backpropagation Through Categorical Variables
Mingzhang Yin
Yuguang Yue
Mingyuan Zhou
16
23
0
04 May 2019
Off-Policy Policy Gradient with State Distribution Correction
Yao Liu
Adith Swaminathan
Alekh Agarwal
Emma Brunskill
OffRL
11
67
0
17 Apr 2019
Decoupled Data Based Approach for Learning to Control Nonlinear Dynamical Systems
Ran A. Wang
Karthikeya S. Parunandi
Dan Yu
D. Kalathil
S. Chakravorty
23
11
0
17 Apr 2019
A Multi-Agent Off-Policy Actor-Critic Algorithm for Distributed Reinforcement Learning
Wesley A Suttle
Zhuoran Yang
Kaipeng Zhang
Zhaoran Wang
Tamer Basar
Ji Liu
OffRL
10
62
0
15 Mar 2019
Sample-Efficient Model-Free Reinforcement Learning with Off-Policy Critics
Denis Steckelmacher
Hélène Plisnier
D. Roijers
A. Nowé
OffRL
23
17
0
11 Mar 2019
Soft Actor-Critic Algorithms and Applications
Tuomas Haarnoja
Aurick Zhou
Kristian Hartikainen
George Tucker
Sehoon Ha
...
Vikash Kumar
Henry Zhu
Abhishek Gupta
Pieter Abbeel
Sergey Levine
18
2,367
0
13 Dec 2018
KF-LAX: Kronecker-factored curvature estimation for control variate optimization in reinforcement learning
Mohammad Firouzi
16
0
0
11 Dec 2018
An Introduction to Deep Reinforcement Learning
Vincent François-Lavet
Peter Henderson
Riashat Islam
Marc G. Bellemare
Joelle Pineau
OffRL
AI4CE
82
1,234
0
30 Nov 2018
An Off-policy Policy Gradient Theorem Using Emphatic Weightings
Ehsan Imani
Eric Graves
Martha White
OffRL
17
71
0
22 Nov 2018
Reward-estimation variance elimination in sequential decision processes
S. Pankov
11
5
0
15 Nov 2018
Importance Weighted Evolution Strategies
Victor Campos
Xavier Giró-i-Nieto
Jordi Torres
19
1
0
12 Nov 2018
Sample-Efficient Policy Learning based on Completely Behavior Cloning
Qiming Zou
Ling Wang
K. Lu
Yu Li
OffRL
19
0
0
09 Nov 2018
Managing engineering systems with large state and action spaces through deep reinforcement learning
Varun Chandrasekaran
K. Papakonstantinou
AI4CE
11
161
0
05 Nov 2018
VIREL: A Variational Inference Framework for Reinforcement Learning
M. Fellows
Anuj Mahajan
Tim G. J. Rudner
Shimon Whiteson
DRL
35
54
0
03 Nov 2018
Differentiable MPC for End-to-end Planning and Control
Brandon Amos
I. D. Rodriguez
Jacob Sacks
Byron Boots
J. Zico Kolter
30
366
0
31 Oct 2018
Relative Importance Sampling For Off-Policy Actor-Critic in Deep Reinforcement Learning
Mahammad Humayoo
Xueqi Cheng
BDL
OffRL
14
5
0
30 Oct 2018
Greedy Actor-Critic: A New Conditional Cross-Entropy Method for Policy Improvement
Samuel Neumann
Sungsu Lim
A. Joseph
Yangchen Pan
Adam White
Martha White
28
7
0
22 Oct 2018
Optimization of Molecules via Deep Reinforcement Learning
Zhenpeng Zhou
S. Kearnes
Li Li
R. Zare
Patrick F. Riley
AI4CE
27
533
0
19 Oct 2018
Using Deep Reinforcement Learning for the Continuous Control of Robotic Arms
Winfried Lötzsch
20
3
0
15 Oct 2018
Deep Reinforcement Learning
Yuxi Li
VLM
OffRL
28
144
0
15 Oct 2018
A Survey and Critique of Multiagent Deep Reinforcement Learning
Pablo Hernandez-Leal
Bilal Kartal
Matthew E. Taylor
OffRL
43
551
0
12 Oct 2018
Neural Approaches to Conversational AI
Jianfeng Gao
Michel Galley
Lihong Li
46
670
0
21 Sep 2018
Sim-to-Real Transfer of Robot Learning with Variable Length Inputs
Vibhavari Dasagi
Robert Lee
Serena Mou
Jake Bruce
Niko Sünderhauf
Jurgen Leitner
OffRL
25
3
0
20 Sep 2018
ANS: Adaptive Network Scaling for Deep Rectifier Reinforcement Learning Models
Yueh-hua Wu
Fan-Yun Sun
Yen-Yu Chang
Shou-De Lin
12
5
0
06 Sep 2018
ARCHER: Aggressive Rewards to Counter bias in Hindsight Experience Replay
Sameera Lanka
Tianfu Wu
28
30
0
06 Sep 2018
Sample-Efficient Imitation Learning via Generative Adversarial Nets
Lionel Blondé
Alexandros Kalousis
GAN
8
47
0
06 Sep 2018
Policy Optimization as Wasserstein Gradient Flows
Ruiyi Zhang
Changyou Chen
Chunyuan Li
Lawrence Carin
14
66
0
09 Aug 2018
Backprop-Q: Generalized Backpropagation for Stochastic Computation Graphs
Xiaoran Xu
Songpeng Zu
Yuan Zhang
Hanning Zhou
Wei Feng
BDL
13
4
0
25 Jul 2018
Remember and Forget for Experience Replay
G. Novati
Petros Koumoutsakos
OffRL
35
90
0
16 Jul 2018
Variance Reduction for Reinforcement Learning in Input-Driven Environments
Hongzi Mao
S. Venkatakrishnan
Malte Schwarzkopf
Mohammad Alizadeh
OffRL
41
95
0
06 Jul 2018
Sample-Efficient Reinforcement Learning with Stochastic Ensemble Value Expansion
Jacob Buckman
Danijar Hafner
George Tucker
E. Brevdo
Honglak Lee
22
328
0
04 Jul 2018
Maximum a Posteriori Policy Optimisation
A. Abdolmaleki
Jost Tobias Springenberg
Yuval Tassa
Rémi Munos
N. Heess
Martin Riedmiller
48
470
0
14 Jun 2018
Qualitative Measurements of Policy Discrepancy for Return-Based Deep Q-Network
Wenjia Meng
Qian Zheng
L. Yang
Pengfei Li
Gang Pan
20
21
0
14 Jun 2018
Learning convex bounds for linear quadratic control policy synthesis
Jack Umenberger
Thomas B. Schon
24
12
0
01 Jun 2018
Reparameterization Gradient for Non-differentiable Models
Wonyeol Lee
Hangyeol Yu
Hongseok Yang
DRL
25
30
0
01 Jun 2018
Data-Efficient Hierarchical Reinforcement Learning
Ofir Nachum
S. Gu
Honglak Lee
Sergey Levine
OffRL
53
797
0
21 May 2018
Policy Optimization with Second-Order Advantage Information
Jiajin Li
Baoxiang Wang
22
6
0
09 May 2018
Deep Reinforcement Learning for Playing 2.5D Fighting Games
Yu-Jhe Li
Hsin-Yu Chang
Yu-Jing Lin
Po-Wei Wu
Y. Wang
GAN
11
5
0
05 May 2018
Recall Traces: Backtracking Models for Efficient Reinforcement Learning
Anirudh Goyal
Philemon Brakel
W. Fedus
Soumye Singhal
Timothy Lillicrap
Sergey Levine
Hugo Larochelle
Yoshua Bengio
OffRL
23
68
0
02 Apr 2018
Variance Reduction for Policy Gradient with Action-Dependent Factorized Baselines
Cathy Wu
Aravind Rajeswaran
Yan Duan
Vikash Kumar
Alexandre M. Bayen
Sham Kakade
Igor Mordatch
Pieter Abbeel
OffRL
14
150
0
20 Mar 2018
Simple random search provides a competitive approach to reinforcement learning
Horia Mania
Aurelia Guy
Benjamin Recht
20
315
0
19 Mar 2018
Policy Search in Continuous Action Domains: an Overview
Olivier Sigaud
F. Stulp
16
72
0
13 Mar 2018