ResearchTrend.AI
Variance Reduction for Policy Gradient with Action-Dependent Factorized Baselines

20 March 2018
Cathy Wu
Aravind Rajeswaran
Yan Duan
Vikash Kumar
Alexandre M. Bayen
Sham Kakade
Igor Mordatch
Pieter Abbeel
    OffRL

Papers citing "Variance Reduction for Policy Gradient with Action-Dependent Factorized Baselines"

34 / 34 papers shown
Multi-Fidelity Policy Gradient Algorithms
Xinjie Liu
Cyrus Neary
Kushagra Gupta
Christian Ellis
Ufuk Topcu
David Fridovich-Keil
OffRL
235
0
0
07 Mar 2025
Orchestrating Joint Offloading and Scheduling for Low-Latency Edge SLAM
Yao Zhang
Yuyi Mao
Hui Wang
Zhiwen Yu
Song Guo
Jun Zhang
Liang Wang
B. Guo
48
0
0
23 Feb 2025
Distillation Policy Optimization
Jianfei Ma
OffRL
26
1
0
01 Feb 2023
SoftTreeMax: Exponential Variance Reduction in Policy Gradient via Tree Search
Gal Dalal
Assaf Hallak
Gugan Thoppe
Shie Mannor
Gal Chechik
29
3
0
30 Jan 2023
Stochastic Dimension-reduced Second-order Methods for Policy Optimization
Jinsong Liu
Chen Xie
Qinwen Deng
Dongdong Ge
Yi-Li Ye
32
1
0
28 Jan 2023
The Role of Baselines in Policy Gradient Optimization
Jincheng Mei
Wesley Chung
Valentin Thomas
Bo Dai
Csaba Szepesvári
Dale Schuurmans
29
16
0
16 Jan 2023
An Improved Analysis of (Variance-Reduced) Policy Gradient and Natural Policy Gradient Methods
Yanli Liu
Kaipeng Zhang
Tamer Basar
W. Yin
48
102
0
15 Nov 2022
Hyperbolic Deep Reinforcement Learning
Edoardo Cetin
B. Chamberlain
Michael M. Bronstein
Jonathan J. Hunt
48
21
0
04 Oct 2022
GFlowNets and variational inference
Nikolay Malkin
Salem Lahlou
T. Deleu
Xu Ji
J. E. Hu
Katie Everett
Dinghuai Zhang
Yoshua Bengio
BDL
136
78
0
02 Oct 2022
Constrained Update Projection Approach to Safe Policy Optimization
Long Yang
Jiaming Ji
Juntao Dai
Linrui Zhang
Binbin Zhou
Pengfei Li
Yaodong Yang
Gang Pan
41
43
0
15 Sep 2022
Deep Reinforcement Learning for Data-Driven Adaptive Scanning in Ptychography
M. Schloz
Johannes Müller
T. Pekin
W. V. D. Broek
C. Koch
35
7
0
29 Mar 2022
Variance Reduction based Experience Replay for Policy Optimization
Hua Zheng
Wei Xie
M. Feng
OffRL
41
2
0
17 Oct 2021
Soft Actor-Critic With Integer Actions
Ting-Han Fan
Yubo Wang
30
12
0
17 Sep 2021
Settling the Variance of Multi-Agent Policy Gradients
J. Kuba
Muning Wen
Yaodong Yang
Linghui Meng
Shangding Gu
Haifeng Zhang
D. Mguni
Jun Wang
24
59
0
19 Aug 2021
Factored Policy Gradients: Leveraging Structure for Efficient Learning in MOMDPs
Thomas Spooner
N. Vadori
Sumitra Ganesh
30
7
0
20 Feb 2021
Counterfactual Data Augmentation using Locally Factored Dynamics
Silviu Pitis
Elliot Creager
Animesh Garg
BDL
OffRL
26
85
0
06 Jul 2020
From Importance Sampling to Doubly Robust Policy Gradient
Jiawei Huang
Nan Jiang
OffRL
30
24
0
20 Oct 2019
V-MPO: On-Policy Maximum a Posteriori Policy Optimization for Discrete and Continuous Control
H. F. Song
A. Abdolmaleki
Jost Tobias Springenberg
Aidan Clark
Hubert Soyer
...
Dhruva Tirumala
N. Heess
Dan Belov
Martin Riedmiller
M. Botvinick
37
121
0
26 Sep 2019
Sample Efficient Policy Gradient Methods with Recursive Variance Reduction
Pan Xu
F. Gao
Quanquan Gu
31
83
0
18 Sep 2019
Trajectory-wise Control Variates for Variance Reduction in Policy Gradient Methods
Ching-An Cheng
Xinyan Yan
Byron Boots
25
22
0
08 Aug 2019
Hindsight Trust Region Policy Optimization
Hanbo Zhang
Site Bai
Xuguang Lan
David Hsu
Nanning Zheng
38
8
0
29 Jul 2019
Deep Reinforcement Learning for Cyber Security
Thanh Thi Nguyen
Vijay Janapa Reddi
OffRL
AI4CE
10
313
0
13 Jun 2019
ARSM: Augment-REINFORCE-Swap-Merge Estimator for Gradient Backpropagation Through Categorical Variables
Mingzhang Yin
Yuguang Yue
Mingyuan Zhou
22
23
0
04 May 2019
Designing a Multi-Objective Reward Function for Creating Teams of Robotic Bodyguards Using Deep Reinforcement Learning
Hassam Sheikh
Ladislau Bölöni
15
3
0
28 Jan 2019
TD-Regularized Actor-Critic Methods
Simone Parisi
Voot Tangkaratt
Jan Peters
Mohammad Emtiyaz Khan
OffRL
30
32
0
19 Dec 2018
Actor-Critic Policy Optimization in Partially Observable Multiagent Environments
S. Srinivasan
Marc Lanctot
V. Zambaldi
Julien Perolat
K. Tuyls
Rémi Munos
Michael Bowling
8
148
0
21 Oct 2018
CM3: Cooperative Multi-goal Multi-stage Multi-agent Reinforcement Learning
Jiachen Yang
A. Nakhaei
David Isele
K. Fujimura
H. Zha
29
75
0
13 Sep 2018
Variance Reduction in Monte Carlo Counterfactual Regret Minimization (VR-MCCFR) for Extensive Form Games using Baselines
Martin Schmid
Neil Burch
Marc Lanctot
Matej Moravčík
Rudolf Kadlec
Michael Bowling
29
64
0
09 Sep 2018
Variance Reduction for Reinforcement Learning in Input-Driven Environments
Hongzi Mao
S. Venkatakrishnan
Malte Schwarzkopf
Mohammad Alizadeh
OffRL
41
95
0
06 Jul 2018
Stochastic Variance-Reduced Policy Gradient
Matteo Papini
Damiano Binaghi
Giuseppe Canonaco
Matteo Pirotta
Marcello Restelli
19
174
0
14 Jun 2018
Policy Optimization with Second-Order Advantage Information
Jiajin Li
Baoxiang Wang
22
6
0
09 May 2018
The Mirage of Action-Dependent Baselines in Reinforcement Learning
George Tucker
Surya Bhupatiraju
S. Gu
Richard Turner
Zoubin Ghahramani
Sergey Levine
OffRL
30
126
0
27 Feb 2018
Expected Policy Gradients for Reinforcement Learning
K. Ciosek
Shimon Whiteson
50
51
0
10 Jan 2018
Backpropagation through the Void: Optimizing control variates for black-box gradient estimation
Will Grathwohl
Dami Choi
Yuhuai Wu
Geoffrey Roeder
David Duvenaud
56
300
0
31 Oct 2017