Variational Orthogonal Features

Abstract

Sparse stochastic variational inference allows Gaussian process models to be applied to large datasets. The per-iteration computational cost of inference with this method is $\mathcal{O}(\tilde{N}M^2+M^3)$, where $\tilde{N}$ is the number of points in a minibatch and $M$ is the number of 'inducing features', which determine the expressiveness of the variational family. Several recent works have shown that, for certain priors, features can be defined that remove the $\mathcal{O}(M^3)$ cost of computing a minibatch estimate of an evidence lower bound (ELBO). This represents significant computational savings when $M \gg \tilde{N}$. We present a construction of features for any stationary prior kernel that allow for computation of an unbiased estimator of the ELBO using $T$ Monte Carlo samples in $\mathcal{O}(\tilde{N}T+M^2T)$, and in $\mathcal{O}(\tilde{N}T+MT)$ with an additional approximation. We analyze the impact of this additional approximation on inference quality.
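
To give a feel for how the three stated complexities compare when the number of inducing features is much larger than the minibatch size, the following is a minimal sketch that evaluates only the leading-order terms. It is not the paper's implementation: the function names and the example values of the minibatch size, number of features, and number of Monte Carlo samples are hypothetical, and all constant factors hidden by the big-O notation are ignored.

```python
# Leading-order operation counts for the three per-iteration costs quoted
# in the abstract. Constants hidden by big-O notation are dropped, so the
# numbers are only indicative of scaling, not of actual runtime.
# All function names and example values below are illustrative assumptions.


def standard_svi_cost(n_batch: int, m: int) -> int:
    """O(N~ M^2 + M^3): standard sparse stochastic variational inference."""
    return n_batch * m**2 + m**3


def unbiased_estimator_cost(n_batch: int, m: int, t: int) -> int:
    """O(N~ T + M^2 T): unbiased ELBO estimator with T Monte Carlo samples."""
    return n_batch * t + m**2 * t


def approximate_estimator_cost(n_batch: int, m: int, t: int) -> int:
    """O(N~ T + M T): the same estimator with the additional approximation."""
    return n_batch * t + m * t


if __name__ == "__main__":
    # Hypothetical regime with many more features than minibatch points.
    n_batch, m, t = 100, 2000, 10
    print("standard SVI:         ", standard_svi_cost(n_batch, m))
    print("unbiased estimator:   ", unbiased_estimator_cost(n_batch, m, t))
    print("with approximation:   ", approximate_estimator_cost(n_batch, m, t))
```

In this hypothetical regime the cubic term dominates the standard cost, which is why removing it matters most when $M \gg \tilde{N}$; for small $M$ or large $T$ the ordering of the first two costs can differ.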
