Self-Composing Policies for Scalable Continual Reinforcement Learning

4 June 2025

Main: 7 Pages

22 Figures

Bibliography: 4 Pages

6 Tables

Appendix: 18 Pages

Abstract

This work introduces a growable and modular neural network architecture that naturally avoids catastrophic forgetting and interference in continual reinforcement learning. The structure of each module allows it to selectively combine previous policies with its own internal policy, which accelerates learning on the current task. We show that, unlike previous growing neural network approaches, the number of parameters of the proposed approach grows linearly with the number of tasks, and that the approach does not sacrifice plasticity to scale. Experiments on benchmark continuous-control and visual problems show that the proposed approach achieves greater knowledge transfer and performance than alternative methods.
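The composition idea in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: the class names (`Module`, `GrowingPolicy`), the attention-style weighting, the linear internal policy, and all dimensions are assumptions chosen only to show how one fixed-size module per task could mix the outputs of earlier modules with its own policy, and why the total parameter count would then grow linearly with the number of tasks.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(vec, mat):
    """Row vector (length n) times matrix (n rows, m columns)."""
    return [sum(v * row[j] for v, row in zip(vec, mat))
            for j in range(len(mat[0]))]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class Module:
    """Hypothetical per-task module; its size does not depend on the task index."""

    def __init__(self, obs_dim, act_dim, key_dim, rng):
        self.w_query = rand_matrix(obs_dim, key_dim, rng)      # observation -> query
        self.w_key = rand_matrix(act_dim, key_dim, rng)        # previous output -> key
        self.w_internal = rand_matrix(obs_dim, act_dim, rng)   # internal policy (assumed linear)

    def num_params(self):
        mats = (self.w_query, self.w_key, self.w_internal)
        return sum(len(m) * len(m[0]) for m in mats)

    def act(self, obs, prev_outputs):
        logits = matvec(obs, self.w_internal)
        if prev_outputs:
            # Weight each previous policy's output by its relevance to obs.
            query = matvec(obs, self.w_query)
            scores = [sum(q * k for q, k in zip(query, matvec(p, self.w_key)))
                      for p in prev_outputs]
            weights = softmax(scores)
            for w, p in zip(weights, prev_outputs):
                logits = [l + w * a for l, a in zip(logits, p)]
        return softmax(logits)


class GrowingPolicy:
    """One module per task. In a real setup only the newest module would be
    trained and earlier ones kept frozen (no training is shown here)."""

    def __init__(self, obs_dim, act_dim, key_dim=4, seed=0):
        self.dims = (obs_dim, act_dim, key_dim)
        self.rng = random.Random(seed)
        self.modules = []

    def add_task(self):
        self.modules.append(Module(*self.dims, self.rng))

    def act(self, obs):
        outputs = []
        for module in self.modules:
            outputs.append(module.act(obs, outputs))
        return outputs[-1]  # the newest module's action distribution

    def num_params(self):
        return sum(m.num_params() for m in self.modules)
```

In this sketch each module carries the same number of parameters no matter how many modules precede it, because keys are computed from the previous outputs themselves rather than from per-predecessor weights. Calling `add_task()` three times on `GrowingPolicy(3, 2)` therefore yields exactly three times the per-module parameter count.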


@article{malagón2025_2506.14811,
  title={Self-Composing Policies for Scalable Continual Reinforcement Learning},
  author={Mikel Malagón and Josu Ceberio and Jose A. Lozano},
  journal={arXiv preprint arXiv:2506.14811},
  year={2025}
}
