Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2205.11814
Cited By
v1
v2 (latest)
Penalized Proximal Policy Optimization for Safe Reinforcement Learning
24 May 2022
Linrui Zhang
Li Shen
Long Yang
Shi-Yong Chen
Bo Yuan
Xueqian Wang
Dacheng Tao
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Penalized Proximal Policy Optimization for Safe Reinforcement Learning"
16 / 16 papers shown
Title
Embedding Safety into RL: A New Take on Trust Region Methods
Nikola Milosevic
Johannes Müller
Nico Scherf
90
2
0
05 Nov 2024
The Surprising Effectiveness of PPO in Cooperative, Multi-Agent Games
Chao Yu
Akash Velu
Eugene Vinitsky
Jiaxuan Gao
Yu Wang
Alexandre M. Bayen
Yi Wu
OffRL
134
1,252
0
02 Mar 2021
Projection-Based Constrained Policy Optimization
Tsung-Yen Yang
Justinian P. Rosca
Karthik Narasimhan
Peter J. Ramadge
38
241
0
07 Oct 2020
Responsive Safety in Reinforcement Learning by PID Lagrangian Methods
Adam Stooke
Joshua Achiam
Pieter Abbeel
76
299
0
08 Jul 2020
Provably Efficient Safe Exploration via Primal-Dual Policy Optimization
Dongsheng Ding
Xiaohan Wei
Zhuoran Yang
Zhaoran Wang
M. Jovanović
77
165
0
01 Mar 2020
Constrained Reinforcement Learning Has Zero Duality Gap
Santiago Paternain
Luiz F. O. Chamon
Miguel Calvo-Fullana
Alejandro Ribeiro
54
193
0
29 Oct 2019
IPO: Interior-point Policy Optimization under Constraints
Yongshuai Liu
J. Ding
Xin Liu
85
181
0
21 Oct 2019
Reward Constrained Policy Optimization
Chen Tessler
D. Mankowitz
Shie Mannor
83
540
0
28 May 2018
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
499
19,065
0
20 Jul 2017
Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments
Ryan J. Lowe
Yi Wu
Aviv Tamar
J. Harb
Pieter Abbeel
Igor Mordatch
140
4,485
0
07 Jun 2017
Constrained Policy Optimization
Joshua Achiam
David Held
Aviv Tamar
Pieter Abbeel
113
1,325
0
30 May 2017
Risk-Constrained Reinforcement Learning with Percentile Risk Criteria
Yinlam Chow
Mohammad Ghavamzadeh
Lucas Janson
Marco Pavone
73
512
0
05 Dec 2015
Doubly Robust Off-policy Value Evaluation for Reinforcement Learning
Nan Jiang
Lihong Li
OffRL
202
623
0
11 Nov 2015
End-to-End Training of Deep Visuomotor Policies
Sergey Levine
Chelsea Finn
Trevor Darrell
Pieter Abbeel
BDL
315
3,437
0
02 Apr 2015
Trust Region Policy Optimization
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
277
6,776
0
19 Feb 2015
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
1.8K
150,115
0
22 Dec 2014
1