Disentanglement by means of action-induced representations

Gorka Muñoz-Gil
Hendrik Poulsen Nautrup
Arunava Majumder
Paulin de Schoulepnikoff
Florian Fürrutter
Marius Krumm
Hans J. Briegel
Main: 8 pages · 7 figures · 5 tables · Bibliography: 2 pages · Appendix: 9 pages
Abstract

Learning interpretable representations with variational autoencoders (VAEs) is a major goal of representation learning. The main challenge lies in obtaining disentangled representations, where each latent dimension corresponds to a distinct generative factor. This difficulty is fundamentally tied to the impossibility of performing nonlinear independent component analysis in general. Here, we introduce the framework of action-induced representations (AIRs), which models representations of physical systems given experiments (or actions) that can be performed on them. We show that, in this framework, we can provably disentangle degrees of freedom with respect to their action dependence. We further introduce a variational AIR architecture (VAIR) that can extract AIRs and therefore achieve provable disentanglement where standard VAEs fail. Beyond state representation, VAIR also captures the action dependence of the underlying generative factors, directly linking experiments to the degrees of freedom they influence.
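The VAIR architecture itself is detailed in the paper; as background for the abstract's discussion of VAEs, the sketch below shows two standard ingredients of any variational autoencoder: the reparameterization trick and the KL divergence of a diagonal Gaussian posterior from the standard-normal prior. This is a generic, NumPy-only illustration under assumed shapes, not the authors' method.

```python
import numpy as np

rng = np.random.default_rng(0)

def reparameterize(mu, log_var):
    # Sample z = mu + sigma * eps with eps ~ N(0, I); writing the sample this
    # way keeps it differentiable w.r.t. the encoder outputs (mu, log_var).
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    # Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) ),
    # summed over the latent dimensions.
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var, axis=-1)

# Toy batch: 4 samples, 3 latent dimensions, posterior equal to the prior.
mu = np.zeros((4, 3))
log_var = np.zeros((4, 3))
z = reparameterize(mu, log_var)
kl = kl_to_standard_normal(mu, log_var)   # zero when posterior == prior
```

In a full VAE, this KL term is added to a reconstruction loss; disentanglement methods such as beta-VAE reweight it, whereas the paper's approach instead conditions the representation on actions.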
