Mutual Information Collapse Explains Disentanglement Failure in β-VAEs
The β-VAE is a foundational framework for unsupervised disentanglement, using the hyperparameter β to regulate the trade-off between latent factorization and reconstruction fidelity. Empirically, however, disentanglement performance exhibits a pervasive non-monotonic trend: benchmark scores such as MIG and SAP typically peak at intermediate β and collapse as regularization increases. We demonstrate that this collapse is a fundamental information-theoretic failure, in which strong Kullback-Leibler pressure promotes marginal independence at the expense of the latent channel's semantic informativeness. By formalizing this mechanism in a linear-Gaussian setting, we prove that for sufficiently large β, stationarity-induced dynamics trigger a spectral contraction of the encoder gain, driving latent-factor mutual information to zero. To resolve this, we introduce an augmented β-VAE objective that decouples regularization pressure from informational collapse via an auxiliary reconstruction penalty. Extensive experiments on dSprites, Shapes3D, and MPI3D-real confirm that this auxiliary term stabilizes disentanglement and restores latent informativeness over a significantly broader range of β, providing a principled theoretical justification for dual-parameter regularization in variational-inference backbones.
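To make the dual-parameter objective concrete, here is a minimal sketch of a per-sample loss combining the standard β-weighted KL term with an extra reconstruction weight. This is an illustration under assumptions, not the paper's exact objective: the function name `dual_vae_loss`, the auxiliary weight `lam`, and the choice of mean-squared error and a diagonal-Gaussian posterior are all hypothetical placeholders for the form described in the abstract.

```python
import numpy as np

def dual_vae_loss(x, x_hat, mu, logvar, beta=4.0, lam=1.0):
    """Sketch of a dual-parameter VAE loss (illustrative, not the paper's exact form).

    x, x_hat : original input and decoder reconstruction (arrays of equal shape)
    mu, logvar : mean and log-variance of the diagonal-Gaussian posterior q(z|x)
    beta : weight on the KL regularizer, as in the standard beta-VAE objective
    lam : assumed weight on an auxiliary reconstruction penalty (hypothetical)
    """
    # Mean-squared reconstruction error (a common choice for continuous data).
    recon = np.mean((x - x_hat) ** 2)
    # Closed-form KL(q(z|x) || N(0, I)) for a diagonal Gaussian posterior.
    kl = 0.5 * np.sum(np.exp(logvar) + mu ** 2 - 1.0 - logvar)
    # Auxiliary penalty sketched here as an extra reconstruction term, so that
    # raising beta need not drive the effective reconstruction weight to zero.
    return recon + beta * kl + lam * recon
```

The point of the sketch is the decoupling: β alone controls KL pressure, while the auxiliary weight keeps the latent channel tied to the data even when β is large.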