We introduce a stochastic variational inference procedure for training scalable Gaussian process (GP) models whose per-iteration complexity is independent of both the number of training points, $n$, and the number of basis functions used in the kernel approximation, $m$. Our central contributions include an unbiased stochastic estimator of the evidence lower bound (ELBO) for a Gaussian likelihood, as well as a stochastic estimator that lower bounds the ELBO for several other likelihoods such as Laplace and logistic. Independence of the stochastic optimization update complexity from $n$ and $m$ enables inference on huge datasets using large-capacity GP models. We demonstrate accurate inference on large classification and regression datasets using GPs and relevance vector machines with up to $m = 10^7$ basis functions.
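To make the flavor of such an estimator concrete, here is a minimal NumPy sketch of a doubly subsampled ELBO estimate for a Gaussian likelihood. This is an illustration under assumptions, not the paper's estimator: it assumes a mean-field variational posterior $q(w) = \mathcal{N}(\mu, \mathrm{diag}(\sigma^2))$ over weights of a random-Fourier-feature basis, and all names (`make_rff`, `stochastic_elbo`, `sub`, ...) are hypothetical. The key device is that the squared predictive mean is estimated as the product of two independent basis subsamples, which keeps the overall estimate unbiased.

```python
# Hypothetical sketch of an unbiased, doubly subsampled ELBO estimator for
# basis-function GP regression: f(x) = sum_j w_j * phi_j(x), with prior
# w ~ N(0, I) and mean-field posterior q(w) = N(mu, diag(exp(log_s2))).
# Per-iteration cost is O(|minibatch| * sub), independent of n and m.
import numpy as np

def make_rff(W, b, m):
    """Random Fourier feature map that materializes only the requested columns."""
    def features(Xb, cols):
        # phi_j(x) = sqrt(2/m) * cos(x . w_j + b_j); cost scales with len(cols), not m
        return np.sqrt(2.0 / m) * np.cos(Xb @ W[:, cols] + b[cols])
    return features

def stochastic_elbo(Xb, yb, n, m, features, mu, log_s2, noise_var, rng, sub=128):
    """One unbiased ELBO estimate from a minibatch and basis-column subsamples.

    Data term: minibatch (Xb, yb) rescaled by n / |Xb|.
    Basis terms: uniform column subsamples rescaled by m / sub; the square of
    the predictive mean uses two INDEPENDENT subsampled means, since squaring
    a single subsampled mean would introduce bias.
    """
    s2 = np.exp(log_s2)
    scale = m / sub

    def mean_hat():
        cols = rng.choice(m, size=sub, replace=False)
        return scale * (features(Xb, cols) @ mu[cols])

    m1, m2 = mean_hat(), mean_hat()  # two independent estimates of E_q[f]
    cols = rng.choice(m, size=sub, replace=False)
    var_f = scale * (features(Xb, cols) ** 2) @ s2[cols]  # estimate of Var_q[f]

    # E_q[log N(y | f, noise_var)] expanded so every term is estimated unbiasedly:
    # E[y^2 - y*(m1 + m2) + m1*m2 + var_f] = y^2 - 2*y*E[f] + E[f]^2 + Var[f]
    quad = yb**2 - yb * (m1 + m2) + m1 * m2 + var_f
    exp_loglik = -0.5 * np.log(2 * np.pi * noise_var) - quad / (2 * noise_var)
    data_term = (n / len(yb)) * exp_loglik.sum()

    # KL(q || N(0, I)) is a sum over all m basis functions; subsample it too.
    cols = rng.choice(m, size=sub, replace=False)
    kl = scale * (0.5 * (s2[cols] + mu[cols]**2 - 1.0 - log_s2[cols])).sum()
    return data_term - kl

# Tiny smoke test: m is large, but each estimate touches only `sub` columns.
rng = np.random.default_rng(0)
d, m, n_batch = 3, 50_000, 64
W, b = rng.normal(size=(d, m)), rng.uniform(0, 2 * np.pi, size=m)
Xb, yb = rng.normal(size=(n_batch, d)), rng.normal(size=n_batch)
mu, log_s2 = 0.01 * rng.normal(size=m), np.full(m, -2.0)
print(stochastic_elbo(Xb, yb, n=10**6, m=m, features=make_rff(W, b, m),
                      mu=mu, log_s2=log_s2, noise_var=0.1, rng=rng))
```

The two independent subsamples matter because $\mathbb{E}[\hat{m}^2] = (\mathbb{E}[\hat{m}])^2 + \mathrm{Var}[\hat{m}] \neq \mathbb{E}[f]^2$ for a single subsampled mean $\hat{m}$, whereas $\mathbb{E}[\hat{m}_1 \hat{m}_2] = \mathbb{E}[f]^2$ when $\hat{m}_1, \hat{m}_2$ are independent; this is what makes the Gaussian-likelihood estimate unbiased rather than merely a lower bound.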