
Unsupervised Skill Discovery through Skill Regions Differentiation

Main: 13 pages
Bibliography: 2 pages
12 figures
3 tables
Abstract

Unsupervised Reinforcement Learning (RL) aims to discover diverse behaviors that can accelerate the learning of downstream tasks. Previous methods typically focus on entropy-based exploration or empowerment-driven skill learning. However, entropy-based exploration struggles in large-scale state spaces (e.g., images), and empowerment-based methods that rely on Mutual Information (MI) estimation offer limited state exploration. To address these challenges, we propose a novel skill discovery objective that maximizes the deviation of the state density of one skill from the explored regions of other skills, encouraging inter-skill state diversity in a manner similar to the original MI objective. For state-density estimation, we construct a novel conditional autoencoder with soft modularization for different skill policies in high-dimensional state spaces. Meanwhile, to incentivize intra-skill exploration, we formulate an intrinsic reward based on the learned autoencoder that resembles count-based exploration in a compact latent space. Through extensive experiments on challenging state- and image-based tasks, we find that our method learns meaningful skills and achieves superior performance on various downstream tasks.
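
As a rough illustration of the autoencoder-based intrinsic reward described in the abstract, the sketch below shows a minimal skill-conditioned autoencoder whose reconstruction error acts as an exploration bonus: states that have rarely been visited under a given skill are reconstructed poorly, so the bonus behaves like a count-based reward in the learned latent space. All names (SkillConditionedAE, intrinsic_reward), network sizes, and the use of plain reconstruction error as a density proxy are illustrative assumptions; the paper's actual model additionally uses soft modularization and an explicit state-density formulation.

import torch
import torch.nn as nn

class SkillConditionedAE(nn.Module):
    # Hypothetical skill-conditioned autoencoder: encodes a state into a
    # compact latent and reconstructs it, conditioned on a one-hot skill.
    def __init__(self, state_dim, skill_dim, latent_dim=16):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(state_dim + skill_dim, 256), nn.ReLU(),
            nn.Linear(256, latent_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim + skill_dim, 256), nn.ReLU(),
            nn.Linear(256, state_dim),
        )

    def forward(self, state, skill):
        z = self.encoder(torch.cat([state, skill], dim=-1))
        recon = self.decoder(torch.cat([z, skill], dim=-1))
        return z, recon

def intrinsic_reward(ae, state, skill):
    # Reconstruction error as a proxy for low state density under the
    # current skill: rarely visited states reconstruct poorly, giving a
    # bonus that resembles count-based exploration in latent space.
    with torch.no_grad():
        _, recon = ae(state, skill)
        return ((recon - state) ** 2).mean(dim=-1)

In a training loop, such a bonus would be added to the skill policy's objective while the autoencoder is fit on states collected by all skills, so that each skill is pushed both to explore within its own region and to stay distinguishable from the regions covered by other skills.
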

@article{xiao2025_2506.14420,
  title={Unsupervised Skill Discovery through Skill Regions Differentiation},
  author={Ting Xiao and Jiakun Zheng and Rushuai Yang and Kang Xu and Qiaosheng Zhang and Peng Liu and Chenjia Bai},
  journal={arXiv preprint arXiv:2506.14420},
  year={2025}
}