
Central Path Proximal Policy Optimization

31 May 2025
Nikola Milosevic
Johannes Müller
Nico Scherf
arXiv: 2506.00700
Main: 7 pages · Appendix: 5 pages · Bibliography: 2 pages · 6 figures · 1 table
Abstract

In constrained Markov decision processes, enforcing constraints during training is often thought to come at the cost of final return. Recently, it was shown that constraints can be incorporated directly into the policy geometry, yielding an optimization trajectory close to the central path of a barrier method without compromising final return. Building on this idea, we introduce Central Path Proximal Policy Optimization (C3PO), a simple modification of PPO that produces policy iterates that stay close to the central path of the constrained optimization problem. Compared to existing on-policy methods, C3PO delivers improved performance with tighter constraint enforcement, suggesting that central-path-guided updates are a promising direction for constrained policy optimization.
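The abstract does not spell out the C3PO objective, so the following is only a minimal sketch of what a central-path-style PPO update could look like under one plausible reading: the standard clipped surrogate augmented with a log-barrier term on an estimate of the constraint cost, in the spirit of interior-point methods. The function name c3po_style_loss and all parameters (cost_limit, barrier_coef, clip_eps) are illustrative assumptions, not the paper's actual implementation.

# Hypothetical sketch: PPO clipped surrogate plus a log-barrier on the
# constraint cost. Names and defaults are assumptions for illustration only.
import torch

def c3po_style_loss(
    log_probs_new: torch.Tensor,   # log pi_new(a|s) for sampled actions
    log_probs_old: torch.Tensor,   # log pi_old(a|s), detached
    advantages: torch.Tensor,      # reward advantage estimates
    cost_estimate: torch.Tensor,   # scalar estimate of expected constraint cost J_c
    cost_limit: float = 25.0,      # constraint threshold d (assumed value)
    clip_eps: float = 0.2,         # standard PPO clipping range
    barrier_coef: float = 0.1,     # barrier weight 1/t; annealed toward 0
) -> torch.Tensor:
    ratio = torch.exp(log_probs_new - log_probs_old)
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps)
    # Standard PPO clipped surrogate (to be maximized).
    surrogate = torch.min(ratio * advantages, clipped * advantages).mean()
    # Log-barrier on the constraint slack: it diverges as the estimated cost
    # approaches the limit, keeping iterates strictly feasible; the clamp
    # guards against a non-positive slack from a noisy cost estimate.
    slack = torch.clamp(cost_limit - cost_estimate, min=1e-6)
    barrier = barrier_coef * torch.log(slack)
    # Return a loss (negated objective) for a gradient-descent optimizer.
    return -(surrogate + barrier)

As barrier_coef is annealed toward zero, the maximizers of such an objective trace the central path of the underlying barrier problem; how C3PO actually constructs and tracks its iterates is detailed in the paper itself, not in this sketch.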

@article{milosevic2025_2506.00700,
  title={Central Path Proximal Policy Optimization},
  author={Nikola Milosevic and Johannes Müller and Nico Scherf},
  journal={arXiv preprint arXiv:2506.00700},
  year={2025}
}