Energy-Based Transfer for Reinforcement Learning

Reinforcement learning algorithms often suffer from poor sample efficiency, making them challenging to apply in multi-task or continual learning settings. Efficiency can be improved by transferring knowledge from a previously trained teacher policy to guide exploration in new but related tasks. However, if the new task sufficiently differs from the teacher's training task, the transferred guidance may be sub-optimal and bias exploration toward low-reward behaviors. We propose an energy-based transfer learning method that uses out-of-distribution detection to selectively issue guidance, enabling the teacher to intervene only in states within its training distribution. We theoretically show that energy scores reflect the teacher's state-visitation density and empirically demonstrate improved sample efficiency and performance across both single-task and multi-task settings.
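The abstract does not spell out the exact formulation, but the gating idea can be sketched as follows. The sketch below assumes the standard energy score from energy-based out-of-distribution detection (the negative logsumexp of a network's logits), applied here to a hypothetical teacher policy's action logits; the network architecture, temperature, and threshold are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn

class TeacherPolicy(nn.Module):
    """Hypothetical teacher: maps a state to unnormalized action logits."""
    def __init__(self, state_dim: int, num_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, num_actions),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)

def energy_score(logits: torch.Tensor, temperature: float = 1.0) -> torch.Tensor:
    # Assumed form: E(s) = -T * log sum_a exp(f_a(s) / T).
    # Lower energy is taken to indicate higher teacher state-visitation density.
    return -temperature * torch.logsumexp(logits / temperature, dim=-1)

def select_action(student_action: torch.Tensor,
                  teacher: TeacherPolicy,
                  state: torch.Tensor,
                  energy_threshold: float) -> torch.Tensor:
    """Let the teacher intervene only when the state looks in-distribution."""
    logits = teacher(state)
    if energy_score(logits).item() < energy_threshold:   # in-distribution: follow teacher
        return logits.argmax(dim=-1)
    return student_action                                 # otherwise defer to the student

# Toy usage (shapes and threshold are illustrative).
teacher = TeacherPolicy(state_dim=8, num_actions=4)
state = torch.randn(8)
student_action = torch.tensor(2)
action = select_action(student_action, teacher, state, energy_threshold=-2.0)

In this sketch the threshold acts as the selectivity knob: a lower (more negative) threshold restricts teacher intervention to states the teacher has seen densely, while a higher one lets the teacher guide more of the exploration.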
@article{deng2025_2506.16590,
  title   = {Energy-Based Transfer for Reinforcement Learning},
  author  = {Zeyun Deng and Jasorsi Ghosh and Fiona Xie and Yuzhe Lu and Katia Sycara and Joseph Campbell},
  journal = {arXiv preprint arXiv:2506.16590},
  year    = {2025}
}