Learning Policies with Zero or Bounded Constraint Violation for Constrained MDPs

4 June 2021

Papers citing "Learning Policies with Zero or Bounded Constraint Violation for Constrained MDPs"

Title
Probabilistic Shielding for Safe Reinforcement Learning | Edwin Hamel-De le Court, Francesco Belardinelli, Alex W. Goodall | 44 0 0 | 09 Mar 2025
Ensemble RL through Classifier Models: Enhancing Risk-Return Trade-offs in Trading Strategies | Zheli Xiong | 49 0 0 | 23 Feb 2025
Session-Level Dynamic Ad Load Optimization using Offline Robust Reinforcement Learning | Tao Liu, Qi Xu, Wei Shi, Zhigang Hua, Shuang Yang | OffRL | 43 0 0 | 09 Jan 2025
Near-Optimal Policy Identification in Robust Constrained Markov Decision Processes via Epigraph Form | Toshinori Kitamura, Tadashi Kozuno, Wataru Kumagai, Kenta Hoshino, Y. Hosoe, Kazumi Kasaura, Masashi Hamaya, Paavo Parmas, Yutaka Matsuo | 72 0 0 | 29 Aug 2024
Structured Reinforcement Learning for Media Streaming at the Wireless Edge | Archana Bura, Sarat Chandra Bobbili, Shreyas Rameshkumar, Desik Rengarajan, D. Kalathil, S. Shakkottai | 31 0 0 | 10 Apr 2024
Learning Adversarial MDPs with Stochastic Hard Constraints | Francesco Emanuele Stradi, Matteo Castiglioni, A. Marchesi, Nicola Gatti | 28 4 0 | 06 Mar 2024
A Policy Gradient Primal-Dual Algorithm for Constrained MDPs with Uniform PAC Guarantees | Toshinori Kitamura, Tadashi Kozuno, Masahiro Kato, Yuki Ichihara, Soichiro Nishimori, Akiyoshi Sannai, Sho Sonoda, Wataru Kumagai, Yutaka Matsuo | 42 2 0 | 31 Jan 2024
A safe exploration approach to constrained Markov decision processes | Tingting Ni, Maryam Kamgarpour | 39 3 0 | 01 Dec 2023
Provably Efficient Exploration in Constrained Reinforcement Learning: Posterior Sampling Is All You Need | Danil Provodin, Pratik Gajane, Mykola Pechenizkiy, M. Kaptein | 39 0 0 | 27 Sep 2023
Constrained Reinforcement Learning via Dissipative Saddle Flow Dynamics | Tianqi Zheng, Pengcheng You, Enrique Mallada | 34 3 0 | 03 Dec 2022
A Near-Optimal Primal-Dual Method for Off-Policy Learning in CMDP | Fan Chen, Junyu Zhang, Zaiwen Wen | OffRL | 39 8 0 | 13 Jul 2022
Anchor-Changing Regularized Natural Policy Gradient for Multi-Objective Reinforcement Learning | Ruida Zhou, Tao-Wen Liu, D. Kalathil, P. R. Kumar, Chao Tian | 32 13 0 | 10 Jun 2022
Learning Infinite-Horizon Average-Reward Markov Decision Processes with Constraints | Liyu Chen, R. Jain, Haipeng Luo | 57 25 0 | 31 Jan 2022
Achieving Zero Constraint Violation for Constrained Reinforcement Learning via Primal-Dual Approach | Qinbo Bai, Amrit Singh Bedi, Mridul Agarwal, Alec Koppel, Vaneet Aggarwal | 107 56 0 | 13 Sep 2021
Concave Utility Reinforcement Learning with Zero-Constraint Violations | Mridul Agarwal, Qinbo Bai, Vaneet Aggarwal | 36 12 0 | 12 Sep 2021
Learning in Markov Decision Processes under Constraints | Rahul Singh, Abhishek Gupta, Ness B. Shroff | 44 27 0 | 27 Feb 2020