ResearchTrend.AI
Convergent Policy Optimization for Safe Reinforcement Learning

26 October 2019
Ming Yu
Zhuoran Yang
Mladen Kolar
Zhaoran Wang

Papers citing "Convergent Policy Optimization for Safe Reinforcement Learning"

35 / 35 papers shown
When to Localize? A Risk-Constrained Reinforcement Learning Approach
Chak Lam Shek
Kasra Torshizi
Troi Williams
Pratap Tokekar
92
2
0
05 Nov 2024
GenSafe: A Generalizable Safety Enhancer for Safe Reinforcement Learning Algorithms Based on Reduced Order Markov Decision Process Model
Zhehua Zhou
Xuan Xie
Jiayang Song
Zhan Shu
Lei Ma
91
1
0
06 Jun 2024
Neural Policy Gradient Methods: Global Optimality and Rates of Convergence
Lingxiao Wang
Qi Cai
Zhuoran Yang
Zhaoran Wang
85
241
0
29 Aug 2019
On the Global Convergence of Actor-Critic: A Case for Linear Quadratic Regulator with Ergodic Cost
Zhuoran Yang
Yongxin Chen
Mingyi Hong
Zhaoran Wang
94
40
0
14 Jul 2019
Neural Proximal/Trust Region Policy Optimization Attains Globally Optimal Policy
Boyi Liu
Qi Cai
Zhuoran Yang
Zhaoran Wang
73
111
0
25 Jun 2019
Provably Efficient Q-Learning with Low Switching Cost
Yu Bai
Tengyang Xie
Nan Jiang
Yu Wang
69
93
0
30 May 2019
Communication-Efficient Policy Gradient Methods for Distributed Reinforcement Learning
Tianyi Chen
Kai Zhang
G. Giannakis
Tamer Basar
OffRL
51
41
0
07 Dec 2018
Finite-Sample Analysis For Decentralized Batch Multi-Agent Reinforcement Learning With Networked Agents
Kai Zhang
Zhuoran Yang
Han Liu
Tong Zhang
Tamer Basar
OffRL
59
26
0
06 Dec 2018
Macro action selection with deep reinforcement learning in StarCraft
Sijia Xu
Hongyu Kuang
Zhi Zhuang
Renjie Hu
Yang Liu
Huyang Sun
57
28
0
02 Dec 2018
Modular Architecture for StarCraft II with Deep Reinforcement Learning
Dennis Lee
Mizanur Rahman
Jeffrey O. Zhang
Huazhe Xu
Jerome McClendon
Pieter Abbeel
82
56
0
08 Nov 2018
TStarBots: Defeating the Cheating Level Builtin AI in StarCraft II in the Full Game
Peng Sun
Xinghai Sun
Lei Han
Jiechao Xiong
Qing Wang
...
Yang Zheng
Ji Liu
Yongsheng Liu
Han Liu
Tong Zhang
63
76
0
19 Sep 2018
Risk-Sensitive Generative Adversarial Imitation Learning
Jonathan Lacotte
Mohammad Ghavamzadeh
Yinlam Chow
Marco Pavone
GAN
66
24
0
13 Aug 2018
A Tour of Reinforcement Learning: The View from Continuous Control
Benjamin Recht
121
629
0
25 Jun 2018
Multi-Agent Reinforcement Learning via Double Averaging Primal-Dual Optimization
Hoi-To Wai
Zhuoran Yang
Zhaoran Wang
Mingyi Hong
68
170
0
03 Jun 2018
Reward Constrained Policy Optimization
Chen Tessler
D. Mankowitz
Shie Mannor
83
541
0
28 May 2018
Learning Safe Policies with Expert Guidance
Je-chun Huang
Fa Wu
Doina Precup
Yang Cai
49
25
0
21 May 2018
A Lyapunov-based Approach to Safe Reinforcement Learning
Yinlam Chow
Ofir Nachum
Edgar A. Duénez-Guzmán
Mohammad Ghavamzadeh
163
506
0
20 May 2018
Global Convergence of Policy Gradient Methods for the Linear Quadratic Regulator
Maryam Fazel
Rong Ge
Sham Kakade
M. Mesbahi
82
605
0
15 Jan 2018
On the Sample Complexity of the Linear Quadratic Regulator
Sarah Dean
Horia Mania
Nikolai Matni
Benjamin Recht
Stephen Tu
73
578
0
04 Oct 2017
StarCraft II: A New Challenge for Reinforcement Learning
Oriol Vinyals
T. Ewalds
Sergey Bartunov
Petko Georgiev
A. Vezhnevets
...
Anthony Brunasso
David Lawrence
Anders Ekermo
J. Repp
Rodney Tsing
78
874
0
16 Aug 2017
Constrained Policy Optimization
Joshua Achiam
David Held
Aviv Tamar
Pieter Abbeel
113
1,325
0
30 May 2017
Safe Model-based Reinforcement Learning with Stability Guarantees
Felix Berkenkamp
M. Turchetta
Angela P. Schoellig
Andreas Krause
176
852
0
23 May 2017
A General Safety Framework for Learning-Based Control in Uncertain Robotic Systems
J. F. Fisac
Anayo K. Akametalu
Melanie Zeilinger
Shahab Kaynama
J. Gillula
Claire Tomlin
58
497
0
03 May 2017
Multiagent Bidirectionally-Coordinated Nets: Emergence of Human-level Coordination in Learning to Play StarCraft Combat Games
Peng Peng
Ying Wen
Yaodong Yang
Quan Yuan
Zhenkun Tang
Haitao Long
Jun Wang
65
335
0
29 Mar 2017
Deep Reinforcement Learning: An Overview
Yuxi Li
OffRL, VLM
181
1,539
0
25 Jan 2017
Concrete Problems in AI Safety
Dario Amodei
C. Olah
Jacob Steinhardt
Paul Christiano
John Schulman
Dandelion Mané
236
2,389
0
21 Jun 2016
Safe Exploration in Finite Markov Decision Processes with Gaussian Processes
M. Turchetta
Felix Berkenkamp
Andreas Krause
84
189
0
15 Jun 2016
Asynchronous Methods for Deep Reinforcement Learning
Volodymyr Mnih
Adria Puigdomenech Badia
M. Berk Mirza
Alex Graves
Timothy Lillicrap
Tim Harley
David Silver
Koray Kavukcuoglu
199
8,859
0
04 Feb 2016
Risk-Constrained Reinforcement Learning with Percentile Risk Criteria
Yinlam Chow
Mohammad Ghavamzadeh
Lucas Janson
Marco Pavone
73
512
0
05 Dec 2015
Solving Transition-Independent Multi-agent MDPs with Sparse Interactions (Extended version)
Joris Scharpff
D. Roijers
F. Oliehoek
M. Spaan
Mathijs de Weerdt
38
33
0
29 Nov 2015
Massively Parallel Methods for Deep Reinforcement Learning
Arun Nair
Praveen Srinivasan
Sam Blackwell
Cagdas Alcicek
Rory Fearon
...
Stig Petersen
Shane Legg
Volodymyr Mnih
Koray Kavukcuoglu
David Silver
OffRL, AI4CE, GNN
96
503
0
15 Jul 2015
Safe Policy Search for Lifelong Reinforcement Learning with Sublinear Regret
Haitham Bou-Ammar
Rasul Tutunov
Eric Eaton
OffRL, CLL
67
64
0
21 May 2015
Trust Region Policy Optimization
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
277
6,776
0
19 Feb 2015
Variance-Constrained Actor-Critic Algorithms for Discounted and Average Reward MDPs
Prashanth L.A.
Mohammad Ghavamzadeh
64
70
0
25 Mar 2014
The Arcade Learning Environment: An Evaluation Platform for General Agents
Marc G. Bellemare
Yavar Naddaf
J. Veness
Michael Bowling
117
3,006
0
19 Jul 2012