Simple, scalable algorithms for uncertainty quantification are lacking. Bayesian methods quantify uncertainty through posterior and predictive distributions, but summaries of these distributions, such as quantiles and intervals, are difficult to estimate rapidly. Variational Bayes approximations are widely used but may badly underestimate posterior covariance. Typically, the focus of Bayesian inference is on point and interval estimates for one-dimensional functionals of interest. In small-scale problems, Markov chain Monte Carlo algorithms remain the gold standard, but such algorithms face major problems in scaling up to big data. Various modifications have been proposed based on parallelization and on subsample-based approximations, but such approaches are either highly complex or lack theoretical support and/or good performance outside of narrow settings. We propose a very simple and general posterior interval estimation algorithm, which runs Markov chain Monte Carlo in parallel on subsets of the data and averages the quantiles estimated from each subset. We provide strong theoretical guarantees and illustrate performance in several applications.
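The subset-and-average idea can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions made only for illustration: a normal mean model with known standard deviation and a flat prior, exact draws from each subset posterior standing in for an MCMC run, and a subset likelihood raised to the power of the number of subsets so that each subset posterior has roughly the spread of the full-data posterior. The function names `subset_posterior_draws` and `averaged_quantile_interval` are hypothetical.

```python
import numpy as np


def subset_posterior_draws(y_subset, n_subsets, sigma=1.0, n_draws=4000, rng=None):
    """Draw from one subset posterior of a normal mean (flat prior, known sigma).

    Stand-in for an MCMC run on the subset. The subset likelihood is raised
    to the power n_subsets (an assumption of this sketch), which gives the
    posterior N(mean(y_subset), sigma^2 / (m * n_subsets)).
    """
    rng = np.random.default_rng() if rng is None else rng
    m = len(y_subset)
    post_sd = sigma / np.sqrt(m * n_subsets)
    return rng.normal(np.mean(y_subset), post_sd, size=n_draws)


def averaged_quantile_interval(y, n_subsets=10, alpha=0.05, sigma=1.0, seed=0):
    """Estimate a (1 - alpha) posterior interval by averaging subset quantiles."""
    rng = np.random.default_rng(seed)
    # Randomly partition the data; each subset could be handled by its own worker.
    subsets = np.array_split(rng.permutation(y), n_subsets)
    probs = [alpha / 2, 1 - alpha / 2]
    # One row of (lower, upper) quantiles per subset.
    subset_quantiles = np.array([
        np.quantile(subset_posterior_draws(s, n_subsets, sigma=sigma, rng=rng), probs)
        for s in subsets
    ])
    # Combine by averaging each quantile across subsets.
    return subset_quantiles.mean(axis=0)
```

A call such as `averaged_quantile_interval(np.random.default_rng(1).normal(2.0, 1.0, size=10_000))` returns the lower and upper endpoints of the combined interval. In this toy model the subset loop is sequential, but each iteration is independent and could run on a separate machine.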