
Bregman Centroid Guided Cross-Entropy Method

Main: 8 Pages
8 Figures
Bibliography: 2 Pages
4 Tables
Appendix: 5 Pages
Abstract

The Cross-Entropy Method (CEM) is a widely adopted trajectory optimizer in model-based reinforcement learning (MBRL), but its unimodal sampling strategy often leads to premature convergence in multimodal landscapes. In this work, we propose Bregman Centroid Guided CEM ($\mathcal{BC}$-EvoCEM), a lightweight enhancement to ensemble CEM that leverages Bregman centroids for principled information aggregation and diversity control. $\mathcal{BC}$-EvoCEM computes a performance-weighted Bregman centroid across CEM workers and updates the least contributing ones by sampling within a trust region around the centroid. Leveraging the duality between Bregman divergences and exponential family distributions, we show that $\mathcal{BC}$-EvoCEM integrates seamlessly into standard CEM pipelines with negligible overhead. Empirical results on synthetic benchmarks, a cluttered navigation task, and full MBRL pipelines demonstrate that $\mathcal{BC}$-EvoCEM enhances both convergence and solution quality, providing a simple yet effective upgrade for CEM.
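
The aggregation step described above can be illustrated with a minimal sketch. It is not the paper's implementation and rests on several assumptions. Each CEM worker is taken to hold a diagonal Gaussian. Worker weights come from a softmax over scores. The centroid is the weighted average of expectation parameters, which minimizes the weighted sum of KL(p_i || q), one of the Bregman centroids available for exponential families. The trust region is a simple box around the centroid mean. All function names and the `radius` and `temperature` parameters are hypothetical.

```python
import math
import random


def softmax_weights(scores, temperature=1.0):
    """Map worker scores (higher is better) to normalized weights (assumed scheme)."""
    top = max(scores)
    exps = [math.exp((s - top) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def bregman_centroid(means, variances, weights):
    """Average the expectation parameters (E[x], E[x^2]) of diagonal Gaussians.

    The result minimizes sum_i w_i * KL(p_i || q) over diagonal Gaussians q,
    i.e. it is the moment-matched Gaussian of the weighted mixture.
    """
    dim = len(means[0])
    c_mean = [sum(w * mu[d] for w, mu in zip(weights, means)) for d in range(dim)]
    c_second = [
        sum(w * (var[d] + mu[d] ** 2) for w, mu, var in zip(weights, means, variances))
        for d in range(dim)
    ]
    # Floor the variance to guard against floating-point round-off.
    c_var = [max(s - m ** 2, 1e-12) for s, m in zip(c_second, c_mean)]
    return c_mean, c_var


def respawn_worst(means, variances, scores, radius=0.5, rng=None):
    """Replace the lowest-weight worker with a sample near the centroid.

    The box-shaped trust region (radius times the centroid standard deviation)
    is an illustrative assumption, not the paper's definition.
    """
    rng = rng or random.Random(0)
    weights = softmax_weights(scores)
    c_mean, c_var = bregman_centroid(means, variances, weights)
    worst = min(range(len(scores)), key=lambda i: weights[i])
    new_mean = [
        m + radius * math.sqrt(v) * rng.uniform(-1.0, 1.0)
        for m, v in zip(c_mean, c_var)
    ]
    new_means = [list(m) for m in means]
    new_vars = [list(v) for v in variances]
    new_means[worst] = new_mean
    new_vars[worst] = list(c_var)
    return new_means, new_vars, worst, (c_mean, c_var)
```

In a full pipeline, a step like `respawn_worst` would presumably run between ordinary CEM updates of each worker. Only the least contributing worker is touched here, so the remaining workers keep exploring their own modes.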
