Online Reinforcement Learning in Non-Stationary Context-Driven Environments
International Conference on Learning Representations (ICLR), 2023
Main: 10 Pages
10 Figures
Bibliography: 8 Pages
12 Tables
Appendix: 17 Pages
Abstract
We study online reinforcement learning (RL) in non-stationary environments, where a time-varying exogenous context process affects the environment dynamics. Online RL is challenging in such environments because of "catastrophic forgetting" (CF): the agent tends to forget prior knowledge as it trains on new experiences. Prior approaches to mitigating this issue assume task labels (which are often unavailable in practice), employ brittle regularization heuristics, or use off-policy methods that suffer from instability and poor performance.
