Ghost Policies: A New Paradigm for Understanding and Learning from Failure in Deep Reinforcement Learning

Deep Reinforcement Learning (DRL) agents often exhibit intricate failure modes that are difficult to understand, debug, and learn from. This opacity hinders their reliable deployment in real-world applications. To address this critical gap, we introduce ``Ghost Policies,'' a concept materialized through Arvolution, a novel Augmented Reality (AR) framework. Arvolution renders an agent's historical failed policy trajectories as semi-transparent ``ghosts'' that coexist spatially and temporally with the active agent, enabling an intuitive visualization of policy divergence. Arvolution uniquely integrates: (1) AR visualization of ghost policies, (2) a behavioural taxonomy of DRL maladaptation, (3) a protocol for systematic human disruption to scientifically study failure, and (4) a dual-learning loop where both humans and agents learn from these visualized failures. We propose a paradigm shift, transforming DRL agent failures from opaque, costly errors into invaluable, actionable learning resources, laying the groundwork for a new research field: ``Failure Visualization Learning.''
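The core mechanism the abstract describes, archiving failed rollouts and replaying them step-by-step alongside the live agent so both can be rendered together, can be sketched in a few lines. The snippet below is a minimal, hypothetical illustration of that idea, not the Arvolution implementation; all names (`GhostArchive`, `GhostTrajectory`, `record_failure`, `frames_at`) and the failure labels are assumptions made for this example.

```python
# Minimal sketch (assumed, not the paper's code): record failed policy
# rollouts as "ghost" trajectories and replay them timestep-by-timestep
# alongside the live agent, so a renderer (e.g. an AR overlay) can draw
# the active agent and its semi-transparent past failures together.
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class GhostTrajectory:
    """States visited by one failed rollout, plus a label from a
    behavioural failure taxonomy (labels here are illustrative)."""
    states: List[Tuple[float, ...]]
    failure_label: str


@dataclass
class GhostArchive:
    """Stores failed rollouts and yields them frame-by-frame for overlay."""
    ghosts: List[GhostTrajectory] = field(default_factory=list)

    def record_failure(self, states, failure_label):
        # Archive a rollout that ended in failure.
        self.ghosts.append(GhostTrajectory(list(states), failure_label))

    def frames_at(self, t):
        # Ghost positions at timestep t, so past failures can be drawn
        # at the same moment as the live agent's current state.
        return [(g.failure_label, g.states[t])
                for g in self.ghosts if t < len(g.states)]


# Usage: after a failed rollout, archive it; during the next live rollout,
# query the archive each step to co-render the agent and its ghosts.
archive = GhostArchive()
archive.record_failure([(0.0, 0.0), (0.5, 0.1), (0.9, -0.3)], "overshoot")
for t in range(3):
    live_state = (0.1 * t, 0.0)      # placeholder for the live agent's state
    overlay = archive.frames_at(t)   # ghosts to draw at this timestep
    print(t, live_state, overlay)
```

In this reading, the "ghost" is nothing more than a stored trajectory indexed by timestep; the visualization layer (AR rendering, transparency, alignment with the active agent) and the dual-learning loop would sit on top of such an archive.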
@article{olaz2025_2506.12366,
  title   = {Ghost Policies: A New Paradigm for Understanding and Learning from Failure in Deep Reinforcement Learning},
  author  = {Xabier Olaz},
  journal = {arXiv preprint arXiv:2506.12366},
  year    = {2025}
}