
Evolutionary Policy Optimization

17 April 2025
Zelal Su "Lain" Mustafaoglu
Keshav Pingali
Risto Miikkulainen
Abstract

A key challenge in reinforcement learning (RL) is managing the exploration-exploitation trade-off without sacrificing sample efficiency. Policy gradient (PG) methods excel in exploitation through fine-grained, gradient-based optimization but often struggle with exploration due to their focus on local search. In contrast, evolutionary computation (EC) methods excel in global exploration, but lack mechanisms for exploitation. To address these limitations, this paper proposes Evolutionary Policy Optimization (EPO), a hybrid algorithm that integrates neuroevolution with policy gradient methods for policy optimization. EPO leverages the exploration capabilities of EC and the exploitation strengths of PG, offering an efficient solution to the exploration-exploitation dilemma in RL. EPO is evaluated on the Atari Pong and Breakout benchmarks. Experimental results show that EPO improves both policy quality and sample efficiency compared to standard PG and EC methods, making it effective for tasks that require both exploration and local optimization.
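The abstract describes a hybrid loop: policy-gradient updates supply fine-grained local exploitation, while evolutionary selection and mutation supply global exploration. The paper does not publish its implementation here, so the following is only an illustrative sketch of that general EC + PG pattern on a toy objective; the function names, the population/elite parameters, and the quadratic "return" function are all assumptions standing in for real policy networks and episodic rollouts.

```python
# Illustrative sketch of an EC + policy-gradient hybrid in the spirit of EPO.
# A toy quadratic objective stands in for episodic return; a numerical
# gradient-ascent step stands in for a policy-gradient update.
import numpy as np

rng = np.random.default_rng(0)

def fitness(theta):
    # Toy stand-in for expected return: maximized at theta = (1, 1).
    return -np.sum((theta - 1.0) ** 2)

def grad_step(theta, lr=0.1, eps=1e-4):
    # Exploitation: central-difference gradient ascent on the toy objective.
    g = np.zeros_like(theta)
    for i in range(theta.size):
        e = np.zeros_like(theta)
        e[i] = eps
        g[i] = (fitness(theta + e) - fitness(theta - e)) / (2 * eps)
    return theta + lr * g

def evolve(pop_size=8, dim=2, generations=20, elite_frac=0.5, sigma=0.3):
    # Population of parameter vectors (stand-ins for policy weights).
    pop = [rng.normal(0.0, 2.0, dim) for _ in range(pop_size)]
    for _ in range(generations):
        # Exploitation phase: refine every individual with a gradient step.
        pop = [grad_step(theta) for theta in pop]
        # Exploration phase: rank by fitness, keep elites,
        # refill the population with Gaussian-mutated elites.
        pop.sort(key=fitness, reverse=True)
        elites = pop[: max(1, int(elite_frac * pop_size))]
        while len(elites) < pop_size:
            parent = elites[rng.integers(len(elites))]
            elites.append(parent + rng.normal(0.0, sigma, dim))
        pop = elites
    return max(pop, key=fitness)

best = evolve()
```

On this toy problem the elites converge near the optimum within a few generations; in the actual method the gradient step would be a PG update (e.g. on Atari rollouts) and fitness would be measured episodic return.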

@article{mustafaoglu2025_2504.12568,
  title={Evolutionary Policy Optimization},
  author={Zelal Su "Lain" Mustafaoglu and Keshav Pingali and Risto Miikkulainen},
  journal={arXiv preprint arXiv:2504.12568},
  year={2025}
}