Inverse reinforcement learning (IRL) aims to explain observed complex behavior by fitting reinforcement learning models to behavioral data. However, traditional IRL methods are only applicable when the observations are in the form of state-action paths. This is a problem in many real-world modelling settings, where only more limited observations are easily available. To address this issue, we extend the traditional IRL problem formulation. We call this new formulation the inverse reinforcement learning from summary data (IRL-SD) problem, where instead of state-action paths, only summaries of the paths are observed. We propose exact and approximate methods for both maximum likelihood and full posterior estimation for IRL-SD problems. Through case studies we compare these methods, demonstrating that the approximate methods can be used to solve moderate-sized IRL-SD problems in reasonable time.
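To make the setting concrete, below is a minimal, hypothetical sketch of the IRL-SD idea: inferring a reward parameter when only summaries of the agent's paths are observed, using a simple likelihood-free (rejection-ABC style) approximation of the posterior. All names here (`simulate_summary`, the toy agent, the prior bounds) are illustrative assumptions, not the paper's actual models or implementation.

```python
# Hypothetical sketch: approximate posterior over a reward parameter from
# path summaries only, via rejection ABC. Not the paper's implementation.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def simulate_summary(theta, n_paths=50, horizon=20):
    """Simulate a toy agent whose per-step stopping probability depends on
    the reward parameter theta, and return only a summary of its paths
    (mean path length), never the state-action paths themselves."""
    lengths = []
    for _ in range(n_paths):
        t = 0
        while t < horizon and rng.random() > sigmoid(theta):
            t += 1
        lengths.append(t)
    return float(np.mean(lengths))

# "Observed" summary data generated from an unknown true parameter.
true_theta = 0.5
observed = simulate_summary(true_theta)

# Rejection ABC: draw reward parameters from a prior and keep those whose
# simulated summaries fall close to the observed summary.
prior_samples = rng.uniform(-2.0, 2.0, size=2000)
eps = 0.5
accepted = [th for th in prior_samples
            if abs(simulate_summary(th, n_paths=10) - observed) < eps]

print(f"approximate posterior mean: {np.mean(accepted):.2f} "
      f"(true theta = {true_theta}, {len(accepted)} samples accepted)")
```

The approximate posterior mean can also serve as a crude point estimate, loosely analogous to the approximate maximum likelihood estimation discussed in the abstract; the paper's own exact and approximate estimators are more involved than this toy example.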