A Review of Safe Reinforcement Learning: Methods, Theory and Applications

20 May 2022
Shangding Gu
Longyu Yang
Yali Du
Guang Chen
Florian Walter
Jun Wang
Alois C. Knoll
    OffRL
    AI4TS

Papers citing "A Review of Safe Reinforcement Learning: Methods, Theory and Applications"

50 / 150 papers shown
Supervised Policy Update for Deep Reinforcement Learning
Q. Vuong
Yiming Zhang
George Andriopoulos
38
20
0
29 May 2018
Reward Constrained Policy Optimization
Chen Tessler
D. Mankowitz
Shie Mannor
61
540
0
28 May 2018
Verifiable Reinforcement Learning via Policy Extraction
Osbert Bastani
Yewen Pu
Armando Solar-Lezama
OffRL
109
331
0
22 May 2018
A Lyapunov-based Approach to Safe Reinforcement Learning
Yinlam Chow
Ofir Nachum
Edgar A. Duénez-Guzmán
Mohammad Ghavamzadeh
138
504
0
20 May 2018
AI safety via debate
G. Irving
Paul Christiano
Dario Amodei
228
211
0
02 May 2018
DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills
Xue Bin Peng
Pieter Abbeel
Sergey Levine
M. van de Panne
AI4CE
217
497
0
08 Apr 2018
QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning
Tabish Rashid
Mikayel Samvelyan
Christian Schroeder de Witt
Gregory Farquhar
Jakob N. Foerster
Shimon Whiteson
118
1,662
0
30 Mar 2018
Inequity aversion improves cooperation in intertemporal social dilemmas
Edward Hughes
Joel Z Leibo
Matthew Phillips
K. Tuyls
Edgar A. Duénez-Guzmán
...
Tina Zhu
Kevin R. McKee
Raphael Köster
H. Roff
T. Graepel
50
205
0
23 Mar 2018
Learning-based Model Predictive Control for Safe Exploration
Torsten Koller
Felix Berkenkamp
M. Turchetta
Andreas Krause
39
376
0
22 Mar 2018
Accelerated Primal-Dual Policy Optimization for Safe Reinforcement Learning
Qingkai Liang
Fanyu Que
E. Modiano
49
102
0
19 Feb 2018
A Unified Approach for Multi-step Temporal-Difference Learning with Eligibility Traces in Reinforcement Learning
Long Yang
Minhao Shi
Qian Zheng
Wenjia Meng
Gang Pan
57
24
0
09 Feb 2018
Safe Exploration in Continuous Action Spaces
Gal Dalal
Krishnamurthy Dvijotham
Matej Vecerík
Todd Hester
Cosmin Paduraru
Yuval Tassa
38
438
0
26 Jan 2018
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
Tuomas Haarnoja
Aurick Zhou
Pieter Abbeel
Sergey Levine
219
8,236
0
04 Jan 2018
Boosting the Actor with Dual Critic
Bo Dai
Albert Eaton Shaw
Niao He
Lihong Li
Le Song
52
46
0
29 Dec 2017
OptLayer - Practical Constrained Optimization for Deep Reinforcement Learning in the Real World
Tu-Hoa Pham
Giovanni De Magistris
Ryuki Tachibana
OffRL
27
141
0
22 Sep 2017
Optimal and Learning Control for Autonomous Robots
J. Buchli
Farbod Farshidian
Alexander Winkler
Timothy Sandy
Markus Giftthaler
13
14
0
30 Aug 2017
Safe Reinforcement Learning via Shielding
Mohammed Alshiekh
Roderick Bloem
Rüdiger Ehlers
Bettina Könighofer
S. Niekum
Ufuk Topcu
63
682
0
29 Aug 2017
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
234
18,685
0
20 Jul 2017
Data-Efficient Reinforcement Learning with Probabilistic Model Predictive Control
Sanket Kamthe
M. Deisenroth
104
217
0
20 Jun 2017
Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments
Ryan J. Lowe
Yi Wu
Aviv Tamar
J. Harb
Pieter Abbeel
Igor Mordatch
116
4,441
0
07 Jun 2017
Constrained Policy Optimization
Joshua Achiam
David Held
Aviv Tamar
Pieter Abbeel
91
1,313
0
30 May 2017
Safe Model-based Reinforcement Learning with Stability Guarantees
Felix Berkenkamp
M. Turchetta
Angela P. Schoellig
Andreas Krause
126
845
0
23 May 2017
A General Safety Framework for Learning-Based Control in Uncertain Robotic Systems
J. F. Fisac
Anayo K. Akametalu
Melanie Zeilinger
Shahab Kaynama
J. Gillula
Claire Tomlin
44
494
0
03 May 2017
Deep Reinforcement Learning framework for Autonomous Driving
Ahmad El-Sallab
Mohammed Abdou
E. Perot
S. Yogamani
50
968
0
08 Apr 2017
Multiagent Bidirectionally-Coordinated Nets: Emergence of Human-level Coordination in Learning to Play StarCraft Combat Games
Peng Peng
Ying Wen
Yaodong Yang
Quan Yuan
Zhenkun Tang
Haitao Long
Jun Wang
51
334
0
29 Mar 2017
Deep Deterministic Policy Gradient for Urban Traffic Light Control
Noe Casas
48
167
0
27 Mar 2017
Multi-agent Reinforcement Learning in Sequential Social Dilemmas
Joel Z Leibo
V. Zambaldi
Marc Lanctot
J. Marecki
T. Graepel
48
606
0
10 Feb 2017
Learning Model Predictive Control for iterative tasks. A Data-Driven Control Framework
Ugo Rosolia
Francesco Borrelli
32
322
0
06 Sep 2016
Concrete Problems in AI Safety
Dario Amodei
C. Olah
Jacob Steinhardt
Paul Christiano
John Schulman
Dandelion Mané
147
2,371
0
21 Jun 2016
Safe Exploration in Finite Markov Decision Processes with Gaussian Processes
M. Turchetta
Felix Berkenkamp
Andreas Krause
58
186
0
15 Jun 2016
Cooperative Inverse Reinforcement Learning
Dylan Hadfield-Menell
Anca Dragan
Pieter Abbeel
Stuart J. Russell
60
643
0
09 Jun 2016
Benchmarking Deep Reinforcement Learning for Continuous Control
Yan Duan
Xi Chen
Rein Houthooft
John Schulman
Pieter Abbeel
OffRL
66
1,689
0
22 Apr 2016
Recent Advances in Convolutional Neural Networks
Jiuxiang Gu
Zhenhua Wang
Jason Kuen
Lianyang Ma
Amir Shahroudy
...
Xingxing Wang
Li Wang
Gang Wang
Jianfei Cai
Tsuhan Chen
122
5,184
0
22 Dec 2015
Risk-Constrained Reinforcement Learning with Percentile Risk Criteria
Yinlam Chow
Mohammad Ghavamzadeh
Lucas Janson
Marco Pavone
55
510
0
05 Dec 2015
Continuous control with deep reinforcement learning
Timothy Lillicrap
Jonathan J. Hunt
Alexander Pritzel
N. Heess
Tom Erez
Yuval Tassa
David Silver
Daan Wierstra
191
13,174
0
09 Sep 2015
High-Dimensional Continuous Control Using Generalized Advantage Estimation
John Schulman
Philipp Moritz
Sergey Levine
Michael I. Jordan
Pieter Abbeel
OffRL
40
3,368
0
08 Jun 2015
Risk-Sensitive and Robust Decision-Making: a CVaR Optimization Approach
Yinlam Chow
Aviv Tamar
Shie Mannor
Marco Pavone
106
317
0
06 Jun 2015
Trust Region Policy Optimization
John Schulman
Sergey Levine
Philipp Moritz
Michael I. Jordan
Pieter Abbeel
239
6,722
0
19 Feb 2015
Optimizing the CVaR via Sampling
Aviv Tamar
Yonatan Glassner
Shie Mannor
48
186
0
15 Apr 2014
Safe Exploration of State and Action Spaces in Reinforcement Learning
Javier García
Fernando Fernández
53
163
0
04 Feb 2014
Auto-Encoding Variational Bayes
Diederik P. Kingma
Max Welling
BDL
360
16,962
0
20 Dec 2013
An MDP-based Recommender System
Guy Shani
Ronen I. Brafman
David Heckerman
LRM
64
968
0
12 Dec 2012
On the Sample Complexity of Reinforcement Learning with a Generative Model
M. G. Azar
Rémi Munos
H. Kappen
50
156
0
27 Jun 2012
Policy Gradients with Variance Related Risk Criteria
Dotan Di Castro
Aviv Tamar
Shie Mannor
63
206
0
27 Jun 2012
Safe Exploration in Markov Decision Processes
T. Moldovan
Pieter Abbeel
115
308
0
22 May 2012
Regret-based Reward Elicitation for Markov Decision Processes
K. Regan
Craig Boutilier
75
86
0
09 May 2012
PAC Bounds for Discounted MDPs
Tor Lattimore
Marcus Hutter
64
188
0
17 Feb 2012
Provably Safe and Robust Learning-Based Model Predictive Control
A. Aswani
Humberto González
S. Shankar Sastry
Claire Tomlin
79
522
0
13 Jul 2011
Mean-Variance Optimization in Markov Decision Processes
Shie Mannor
J. Tsitsiklis
67
126
0
29 Apr 2011
Fast Reinforcement Learning for Energy-Efficient Wireless Communications
Nicholas Mastronarde
M. Schaar
46
109
0
29 Sep 2010