
Accelerating Model-Based Reinforcement Learning using Non-Linear Trajectory Optimization

Main: 5 pages, 4 figures; Bibliography: 1 page
Abstract

This paper addresses the slow policy optimization convergence of Monte Carlo Probabilistic Inference for Learning Control (MC-PILCO), a state-of-the-art model-based reinforcement learning (MBRL) algorithm, by integrating it with the iterative Linear Quadratic Regulator (iLQR), a fast trajectory optimization method suitable for nonlinear systems. The proposed method, Exploration-Boosted MC-PILCO (EB-MC-PILCO), leverages iLQR to generate informative, exploratory trajectories and initialize the policy, significantly reducing the number of required optimization steps. Experiments on the cart-pole task demonstrate that EB-MC-PILCO accelerates convergence compared to standard MC-PILCO, achieving up to a 45.9% reduction in execution time when both methods solve the task in four trials. EB-MC-PILCO also maintains a 100% success rate across trials while solving the task faster, even in cases where MC-PILCO converges in fewer iterations.
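To illustrate the policy-initialization idea described above, the following minimal sketch (not the authors' code) fits a simple policy to (state, action) pairs by least squares, in the spirit of imitating an iLQR reference trajectory. The trajectory here is a synthetic placeholder standing in for the exploratory rollout that iLQR would produce on the learned cart-pole model, and names such as `fit_linear_policy` are illustrative assumptions, not part of MC-PILCO's API.

```python
# Minimal sketch: initializing a policy from an iLQR-style reference
# trajectory via least-squares imitation. The data below is a synthetic
# placeholder for an iLQR rollout; the policy class is a plain linear
# controller chosen for brevity, not MC-PILCO's actual policy.
import numpy as np


def fit_linear_policy(states, actions, reg=1e-6):
    """Least-squares fit of u ~= K @ x + b to (state, action) pairs."""
    X = np.hstack([states, np.ones((states.shape[0], 1))])  # append bias column
    W = np.linalg.solve(X.T @ X + reg * np.eye(X.shape[1]), X.T @ actions)
    K, b = W[:-1].T, W[-1]
    return K, b


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T, nx = 100, 4                      # horizon, cart-pole state dimension
    t = np.linspace(0.0, 3.0, T)
    # Placeholder "iLQR" rollout: a smooth swing-up-like state/action profile.
    states = np.stack([np.sin(t), np.cos(t), t / 3.0, -np.sin(2 * t)], axis=1)
    actions = (5.0 * np.sin(2 * t) + 0.1 * rng.standard_normal(T))[:, None]

    K, b = fit_linear_policy(states, actions)
    u_hat = states @ K.T + b
    print("imitation RMSE:", float(np.sqrt(np.mean((u_hat - actions) ** 2))))
```

In the actual method, the (state, action) pairs would come from iLQR applied to the learned model, and the fitted policy would serve only as a warm start for MC-PILCO's subsequent gradient-based policy optimization.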

@article{calì2025_2506.02767,
  title={Accelerating Model-Based Reinforcement Learning using Non-Linear Trajectory Optimization},
  author={Marco Calì and Giulio Giacomuzzo and Ruggero Carli and Alberto Dalla Libera},
  journal={arXiv preprint arXiv:2506.02767},
  year={2025}
}