ResearchTrend.AI
Provably Efficient Safe Exploration via Primal-Dual Policy Optimization

1 March 2020
Dongsheng Ding
Xiaohan Wei
Zhuoran Yang
Zhaoran Wang
M. Jovanović

Papers citing "Provably Efficient Safe Exploration via Primal-Dual Policy Optimization"

37 / 37 papers shown
Primal-Dual Sample Complexity Bounds for Constrained Markov Decision Processes with Multiple Constraints
Max Buckley
Konstantinos Papathanasiou
Andreas Spanopoulos
55
0
0
09 Mar 2025
GenSafe: A Generalizable Safety Enhancer for Safe Reinforcement Learning Algorithms Based on Reduced Order Markov Decision Process Model
Zhehua Zhou
Xuan Xie
Jiayang Song
Zhan Shu
Lei Ma
47
1
0
06 Jun 2024
Constrained Reinforcement Learning Under Model Mismatch
Zhongchang Sun
Sihong He
Fei Miao
Shaofeng Zou
46
4
0
02 May 2024
Structured Reinforcement Learning for Media Streaming at the Wireless Edge
Archana Bura
Sarat Chandra Bobbili
Shreyas Rameshkumar
Desik Rengarajan
D. Kalathil
S. Shakkottai
31
0
0
10 Apr 2024
A Policy Gradient Primal-Dual Algorithm for Constrained MDPs with Uniform PAC Guarantees
Toshinori Kitamura
Tadashi Kozuno
Masahiro Kato
Yuki Ichihara
Soichiro Nishimori
Akiyoshi Sannai
Sho Sonoda
Wataru Kumagai
Yutaka Matsuo
42
2
0
31 Jan 2024
Conservative Exploration for Policy Optimization via Off-Policy Policy Evaluation
Paul Daoudi
Mathias Formoso
Othman Gaizi
Achraf Azize
Evrard Garcelon
OffRL
26
0
0
24 Dec 2023
TRC: Trust Region Conditional Value at Risk for Safe Reinforcement Learning
Dohyeong Kim
Songhwai Oh
19
19
0
01 Dec 2023
State-Wise Safe Reinforcement Learning With Pixel Observations
S. Zhan
Yixuan Wang
Qingyuan Wu
Ruochen Jiao
Chao Huang
Qi Zhu
43
10
0
03 Nov 2023
Confronting Reward Model Overoptimization with Constrained RLHF
Ted Moskovitz
Aaditya K. Singh
DJ Strouse
T. Sandholm
Ruslan Salakhutdinov
Anca D. Dragan
Stephen Marcus McAleer
36
47
0
06 Oct 2023
Provably Efficient Exploration in Constrained Reinforcement Learning: Posterior Sampling Is All You Need
Danil Provodin
Pratik Gajane
Mykola Pechenizkiy
M. Kaptein
39
0
0
27 Sep 2023
Reinforcement Learning with Human Feedback: Learning Dynamic Choices via Pessimism
Zihao Li
Zhuoran Yang
Mengdi Wang
OffRL
31
54
0
29 May 2023
Efficient Exploration Using Extra Safety Budget in Constrained Policy Optimization
Haotian Xu
Shengjie Wang
Zhaolei Wang
Yunzhe Zhang
Qing Zhuo
Yang Gao
Tao Zhang
18
0
0
28 Feb 2023
Pseudonorm Approachability and Applications to Regret Minimization
Christoph Dann
Yishay Mansour
M. Mohri
Jon Schneider
Balasubramanian Sivan
34
5
0
03 Feb 2023
Provable Reset-free Reinforcement Learning by No-Regret Reduction
Hoai-An Nguyen
Ching-An Cheng
OffRL
23
2
0
06 Jan 2023
Offline Policy Optimization in RL with Variance Regularizaton
Riashat Islam
Samarth Sinha
Homanga Bharadhwaj
Samin Yeasar Arnob
Zhuoran Yang
Animesh Garg
Zhaoran Wang
Lihong Li
Doina Precup
OffRL
26
0
0
29 Dec 2022
Quantile Constrained Reinforcement Learning: A Reinforcement Learning Framework Constraining Outage Probability
Whiyoung Jung
Myungsik Cho
Jongeui Park
Young-Jin Sung
35
4
0
28 Nov 2022
Flexible Attention-Based Multi-Policy Fusion for Efficient Deep Reinforcement Learning
Zih-Yun Chiu
Yi-Lin Tuan
William Yang Wang
Michael C. Yip
OffRL
25
3
0
07 Oct 2022
Trustworthy Reinforcement Learning Against Intrinsic Vulnerabilities: Robustness, Safety, and Generalizability
Mengdi Xu
Zuxin Liu
Peide Huang
Wenhao Ding
Zhepeng Cen
Bo-wen Li
Ding Zhao
74
45
0
16 Sep 2022
A Near-Optimal Primal-Dual Method for Off-Policy Learning in CMDP
Fan Chen
Junyu Zhang
Zaiwen Wen
OffRL
39
8
0
13 Jul 2022
Safe Reinforcement Learning via Confidence-Based Filters
Sebastian Curi
Armin Lederer
Sandra Hirche
Andreas Krause
OffRL
24
4
0
04 Jul 2022
Near-Optimal Sample Complexity Bounds for Constrained MDPs
Sharan Vaswani
Lin F. Yang
Csaba Szepesvári
32
32
0
13 Jun 2022
Penalized Proximal Policy Optimization for Safe Reinforcement Learning
Linrui Zhang
Li Shen
Long Yang
Shi-Yong Chen
Bo Yuan
Xueqian Wang
Dacheng Tao
13
62
0
24 May 2022
Safe Reinforcement Learning Using Black-Box Reachability Analysis
Mahmoud Selim
Amr Alanwar
Shreyas Kousik
Grace Gao
Marco Pavone
Karl H. Johansson
29
33
0
15 Apr 2022
Learning Infinite-Horizon Average-Reward Markov Decision Processes with Constraints
Liyu Chen
R. Jain
Haipeng Luo
57
25
0
31 Jan 2022
Model-Based Safe Reinforcement Learning with Time-Varying State and Control Constraints: An Application to Intelligent Vehicles
Xinglong Zhang
Yaoqian Peng
Biao Luo
Wei Pan
Xin Xu
Haibin Xie
27
11
0
18 Dec 2021
Recent Advances in Reinforcement Learning in Finance
B. Hambly
Renyuan Xu
Huining Yang
OffRL
27
166
0
08 Dec 2021
Safe Policy Optimization with Local Generalized Linear Function Approximations
Akifumi Wachi
Yunyue Wei
Yanan Sui
OffRL
30
10
0
09 Nov 2021
Finite-Time Complexity of Online Primal-Dual Natural Actor-Critic Algorithm for Constrained Markov Decision Processes
Sihan Zeng
Thinh T. Doan
Justin Romberg
102
17
0
21 Oct 2021
Achieving Zero Constraint Violation for Constrained Reinforcement Learning via Primal-Dual Approach
Qinbo Bai
Amrit Singh Bedi
Mridul Agarwal
Alec Koppel
Vaneet Aggarwal
107
56
0
13 Sep 2021
Concave Utility Reinforcement Learning with Zero-Constraint Violations
Mridul Agarwal
Qinbo Bai
Vaneet Aggarwal
36
12
0
12 Sep 2021
Safe Reinforcement Learning Using Advantage-Based Intervention
Nolan Wagener
Byron Boots
Ching-An Cheng
29
52
0
16 Jun 2021
Learning Policies with Zero or Bounded Constraint Violation for Constrained MDPs
Tao-Wen Liu
Ruida Zhou
D. Kalathil
P. R. Kumar
Chao Tian
29
78
0
04 Jun 2021
A Provably-Efficient Model-Free Algorithm for Constrained Markov Decision Processes
Honghao Wei
Xin Liu
Lei Ying
19
21
0
03 Jun 2021
Optimal Uniform OPE and Model-based Offline Reinforcement Learning in Time-Homogeneous, Reward-Free and Task-Agnostic Settings
Ming Yin
Yu-Xiang Wang
OffRL
32
19
0
13 May 2021
CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee
Tengyu Xu
Yingbin Liang
Guanghui Lan
42
121
0
11 Nov 2020
Provably Efficient Model-Free Algorithm for MDPs with Peak Constraints
Qinbo Bai
Vaneet Aggarwal
Ather Gattami
14
7
0
11 Mar 2020
Optimism in Reinforcement Learning with Generalized Linear Function Approximation
Yining Wang
Ruosong Wang
S. Du
A. Krishnamurthy
135
135
0
09 Dec 2019