Variance Penalized On-Policy and Off-Policy Actor-Critic

3 February 2021
Arushi Jain
Gandharv Patil
Ayush Jain
Khimya Khetarpal
Doina Precup
    OffRL

Papers citing "Variance Penalized On-Policy and Off-Policy Actor-Critic"

16 / 16 papers shown
Risk-Averse Trust Region Optimization for Reward-Volatility Reduction
Qianggang Ding
Sifan Wu
Hao Sun
Jiadong Guo
Jian Guo
41
127
0
06 Dec 2019
Entropic Risk Measure in Policy Search
David Nass
Boris Belousov
Jan Peters
73
28
0
21 Jun 2019
Reward Constrained Policy Optimization
Chen Tessler
D. Mankowitz
Shie Mannor
86
544
0
28 May 2018
A Distributional Perspective on Reinforcement Learning
Marc G. Bellemare
Will Dabney
Rémi Munos
OffRL
101
1,506
0
21 Jul 2017
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
547
19,296
0
20 Jul 2017
A Greedy Approach to Adapting the Trace Parameter for Temporal Difference Learning
Martha White
Adam White
65
48
0
02 Jul 2016
Safe and Efficient Off-Policy Reinforcement Learning
Rémi Munos
T. Stepleton
Anna Harutyunyan
Marc G. Bellemare
OffRL
138
618
0
08 Jun 2016
OpenAI Gym
Greg Brockman
Vicki Cheung
Ludwig Pettersson
Jonas Schneider
John Schulman
Jie Tang
Wojciech Zaremba
OffRL ODL
225
5,087
0
05 Jun 2016
High-Dimensional Continuous Control Using Generalized Advantage Estimation
John Schulman
Philipp Moritz
Sergey Levine
Michael I. Jordan
Pieter Abbeel
OffRL
133
3,439
0
08 Jun 2015
Algorithms for CVaR Optimization in MDPs
Yinlam Chow
Mohammad Ghavamzadeh
102
201
0
12 Jun 2014
Optimizing the CVaR via Sampling
Aviv Tamar
Yonatan Glassner
Shie Mannor
94
186
0
15 Apr 2014
Variance-Constrained Actor-Critic Algorithms for Discounted and Average Reward MDPs
Prashanth L.A.
Mohammad Ghavamzadeh
69
70
0
25 Mar 2014
Risk-sensitive Reinforcement Learning
Yun Shen
Michael J. Tobia
T. Sommer
Klaus Obermayer
98
320
0
08 Nov 2013
Variance Adjusted Actor Critic Algorithms
Aviv Tamar
Shie Mannor
OffRL
89
43
0
14 Oct 2013
Policy Gradients with Variance Related Risk Criteria
Dotan Di Castro
Aviv Tamar
Shie Mannor
101
211
0
27 Jun 2012
Parametric Return Density Estimation for Reinforcement Learning
Tetsuro Morimura
Masashi Sugiyama
H. Kashima
Hirotaka Hachiya
Toshiyuki Tanaka
85
112
0
15 Mar 2012