
Log-Concave Coupling for Sampling Neural Net Posteriors

Abstract

In this work, we present a sampling algorithm for single hidden layer neural networks. This algorithm is built upon a recursive series of Bayesian posteriors using a method we call Greedy Bayes. Sampling of the Bayesian posterior for neuron weight vectors $w$ of dimension $d$ is challenging because of its multimodality. Our algorithm to tackle this problem is based on a coupling of the posterior density for $w$ with an auxiliary random variable $\xi$. The resulting reverse conditional $w \mid \xi$ of neuron weights given the auxiliary random variable is shown to be log-concave. In the construction of the posterior distributions we provide some freedom in the choice of the prior. In particular, for Gaussian priors on $w$ with suitably small variance, the resulting marginal density of the auxiliary variable $\xi$ is proven to be strictly log-concave for all dimensions $d$. For a uniform prior on the unit $\ell_1$ ball, evidence is given that the density of $\xi$ is again strictly log-concave for sufficiently large $d$. The score of the marginal density of the auxiliary random variable $\xi$ is determined by an expectation over $w \mid \xi$ and thus can be computed by various rapidly mixing Markov Chain Monte Carlo methods. Moreover, the computation of the score of $\xi$ permits sampling of $\xi$ by a stochastic diffusion (Langevin dynamics) with drift function built from this score. With such dynamics, information-theoretic methods pioneered by Bakry and Émery show that accurate sampling of $\xi$ is obtained rapidly when its density is indeed strictly log-concave. After this, one more draw from $w \mid \xi$ produces neuron weights $w$ whose marginal distribution is the desired posterior.
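The two-level structure described above can be illustrated with a short sketch. The code below is not the paper's construction: it assumes, purely for illustration, a Gaussian coupling $\xi \mid w \sim N(\rho w, I)$, a user-supplied gradient grad_log_post of the log posterior for $w$, and unadjusted Langevin dynamics at both levels with illustrative step sizes and iteration counts. It shows the generic pattern: the score of the marginal of $\xi$ is an expectation over the log-concave conditional $w \mid \xi$ (estimated here by inner Langevin chains), that score drives an outer diffusion on $\xi$, and a final draw from $w \mid \xi$ yields the neuron weights.

```python
# Hypothetical sketch only; the coupling form, names, and tuning constants
# below are assumptions for illustration, not the paper's construction.
import numpy as np

rng = np.random.default_rng(0)

def grad_log_conditional_w(w, xi, grad_log_post, rho):
    """Gradient in w of log p(w | xi) with the assumed coupling
    xi | w ~ N(rho * w, I):  grad log p(w) + rho * (xi - rho * w)."""
    return grad_log_post(w) + rho * (xi - rho * w)

def sample_w_given_xi(xi, grad_log_post, rho, dim, n_steps=500, step=1e-3):
    """Unadjusted Langevin chain targeting the (log-concave) conditional w | xi."""
    w = np.zeros(dim)
    for _ in range(n_steps):
        g = grad_log_conditional_w(w, xi, grad_log_post, rho)
        w = w + step * g + np.sqrt(2 * step) * rng.standard_normal(dim)
    return w

def score_xi(xi, grad_log_post, rho, dim, n_inner=20):
    """Score of the marginal of xi via Fisher's identity:
    grad_xi log p(xi) = E_{w|xi}[grad_xi log p(xi | w)] = rho * E[w | xi] - xi."""
    ws = [sample_w_given_xi(xi, grad_log_post, rho, dim) for _ in range(n_inner)]
    return rho * np.mean(ws, axis=0) - xi

def sample_posterior_weights(grad_log_post, rho, dim, n_outer=200, step=1e-2):
    """Outer Langevin dynamics on xi driven by the estimated score,
    followed by one final draw from the conditional w | xi."""
    xi = rng.standard_normal(dim)
    for _ in range(n_outer):
        s = score_xi(xi, grad_log_post, rho, dim)
        xi = xi + step * s + np.sqrt(2 * step) * rng.standard_normal(dim)
    return sample_w_given_xi(xi, grad_log_post, rho, dim)

# Illustrative usage with a standard Gaussian stand-in for the posterior of w,
# i.e. grad_log_post(w) = -w; a real application would supply the gradient of
# the neural-network posterior instead.
if __name__ == "__main__":
    d = 5
    draw = sample_posterior_weights(lambda w: -w, rho=1.0, dim=d)
    print(draw)
```

In this sketch the outer chain mixes rapidly exactly in the regime the abstract describes, namely when the marginal density of $\xi$ is strictly log-concave, while the inner chains exploit the log-concavity of $w \mid \xi$.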
