Bayesian Hypernetworks

We study Bayesian hypernetworks: a framework for approximate Bayesian inference in neural networks. A Bayesian hypernetwork h is a neural network which learns to transform a simple noise distribution, p(ε) = N(0, I), to a distribution q(θ) := q(h(ε)) over the parameters θ of another neural network (the "primary network"). We train q with variational inference, using an invertible h to enable efficient estimation of the variational lower bound on the posterior p(θ|D) via sampling. In contrast to most methods for Bayesian deep learning, Bayesian hypernets can represent a complex multimodal approximate posterior with correlations between parameters, while enabling cheap iid sampling of q(θ). In practice, Bayesian hypernets can provide a better defense against adversarial examples than dropout, and also exhibit competitive performance on a suite of tasks that evaluate model uncertainty, including regularization, active learning, and anomaly detection.
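The core mechanism described above (an invertible map from noise to primary-network parameters, with a tractable density via the change of variables) can be sketched minimally. This is an illustrative assumption, not the paper's architecture: it uses a single elementwise affine flow h(ε) = μ + exp(s)⊙ε, whereas the paper employs richer invertible transformations. All names (`h`, `log_q`, `mu`, `s`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical minimal invertible hypernet: an elementwise affine flow
# h(eps) = mu + exp(s) * eps, mapping noise eps ~ N(0, I) to primary-net
# parameters theta. This sketch only illustrates the change-of-variables
# bookkeeping that makes the variational bound tractable.
D = 5                        # number of primary-network parameters
mu = rng.normal(size=D)      # hypernet parameters (would be learned)
s = rng.normal(size=D) * 0.1

def h(eps):
    """Invertible hypernet: transform noise into primary-net parameters."""
    return mu + np.exp(s) * eps

def h_inv(theta):
    """Inverse map, available because h is invertible."""
    return (theta - mu) / np.exp(s)

def log_q(eps):
    """log q(theta) at theta = h(eps), by change of variables:
    log q(h(eps)) = log p(eps) - log|det dh/deps| = log N(eps; 0, I) - sum(s)."""
    log_p_eps = -0.5 * np.sum(eps ** 2) - 0.5 * D * np.log(2 * np.pi)
    return log_p_eps - np.sum(s)

# Cheap iid sampling of theta ~ q: one noise draw, one forward pass.
eps = rng.normal(size=D)
theta = h(eps)
```

Because log q(θ) is computable in closed form, the entropy term of the variational lower bound can be estimated by plain Monte Carlo sampling of ε.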