Mitigating the Effects of Non-Identifiability on Inference for Bayesian Neural Networks with Latent Variables

Journal of Machine Learning Research (JMLR), 2019
Finale Doshi-Velez
Abstract

Bayesian Neural Networks with Latent Variables (BNN+LVs) provide uncertainties in prediction estimates by explicitly modeling model uncertainty (via priors on network weights) and environmental stochasticity (via a latent input noise variable). In this work, we first show that BNN+LVs suffer from a serious form of non-identifiability: explanatory power can be transferred between model parameters and input noise while fitting the data equally well. We demonstrate that, as a result, the posterior mode over the network weights and latent variables is asymptotically biased away from the ground truth, and hence traditional inference methods may yield parameters that generalize poorly and mis-estimate uncertainty. Next, we develop a novel inference procedure that explicitly mitigates the effects of likelihood non-identifiability during training and yields high-quality predictions as well as uncertainty estimates. We demonstrate that our inference method improves upon benchmark methods across a range of synthetic and real data sets.
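The non-identifiability described above can be illustrated with a minimal sketch (not the paper's actual model): in a regression with additive latent input noise, y = f(x; w) + z + eps with z ~ N(mu_z, s2_z), a shift can move freely between a network bias term and the latent-noise mean without changing the likelihood. All names below (`predictive_mean_var`, the parameter values) are illustrative assumptions.

```python
import numpy as np

# Hedged sketch of the transfer of explanatory power between model
# parameters and latent input noise: the two parameterizations below
# induce the same Gaussian predictive distribution for every x, so no
# amount of data can distinguish them.

rng = np.random.default_rng(0)
x = rng.normal(size=1000)

def predictive_mean_var(x, w, b, mu_z, s2_z, s2_eps):
    # Predictive distribution of y given x is Gaussian with
    # mean = w*x + b + mu_z and variance = s2_z + s2_eps,
    # so only the sum b + mu_z is identifiable from data.
    return w * x + b + mu_z, s2_z + s2_eps

# Parameterization 1: shift carried by the bias b.
m1, v1 = predictive_mean_var(x, w=2.0, b=1.0, mu_z=0.0, s2_z=0.5, s2_eps=0.1)
# Parameterization 2: the same shift carried by the latent-noise mean mu_z.
m2, v2 = predictive_mean_var(x, w=2.0, b=0.0, mu_z=1.0, s2_z=0.5, s2_eps=0.1)

identical = bool(np.allclose(m1, m2) and np.isclose(v1, v2))
print(identical)  # both parameterizations fit any data set equally well
```

A posterior over (b, mu_z) therefore has a ridge along b + mu_z = const, which is the kind of likelihood non-identifiability the proposed inference procedure is designed to mitigate.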
