
Variational Diffusion Auto-encoder: Deep Latent Variable Model with Unconditional Diffusion Prior

Georgios Batzolis
Carola-Bibiane Schönlieb
Abstract

Variational auto-encoders (VAEs) are one of the most popular approaches to deep generative modeling. Despite their success, images generated by VAEs are known to suffer from blurriness, due to the highly unrealistic modeling assumption that the conditional data distribution $p(\textbf{x} \mid \textbf{z})$ can be approximated as an isotropic Gaussian. In this work we introduce a principled approach to modeling the conditional data distribution $p(\textbf{x} \mid \textbf{z})$ by incorporating a diffusion model. We show that it is possible to create a VAE-like deep latent variable model without making the Gaussian assumption on $p(\textbf{x} \mid \textbf{z})$ or even training a decoder network. A trained encoder and an unconditional diffusion model can be combined via Bayes' rule for score functions to obtain an expressive model for $p(\textbf{x} \mid \textbf{z})$. Our approach avoids making strong assumptions on the parametric form of $p(\textbf{x} \mid \textbf{z})$, and thus allows us to significantly improve the performance of VAEs.
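As a brief, hedged illustration (our notation, not an equation quoted from the paper): "Bayes' rule for score functions" refers to the identity obtained by taking $\nabla_{\textbf{x}} \log$ of Bayes' rule, where the term $p(\textbf{z})$ drops out because it does not depend on $\textbf{x}$:

$$\nabla_{\textbf{x}} \log p(\textbf{x} \mid \textbf{z}) = \nabla_{\textbf{x}} \log p(\textbf{x}) + \nabla_{\textbf{x}} \log p(\textbf{z} \mid \textbf{x}).$$

Under this reading, the first term would be supplied by the trained unconditional diffusion model and the second by differentiating the trained encoder's likelihood, so the two components combine into a model of $p(\textbf{x} \mid \textbf{z})$ without ever training a decoder.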
