Fast Gaussian Process Regression for Big Data

17 September 2015

Abstract

Gaussian Processes are widely used for regression tasks. A known limitation in applying them to regression is that computing the solution requires a matrix inversion and the storage of a large matrix in memory. These factors restrict Gaussian Process regression to small and moderate-sized datasets. We present an algorithm based on empirically determined subset selection: it applies model averaging to Gaussian Process estimators developed on bootstrapped datasets. We compare the performance of this algorithm with two other methods used to apply Gaussian Process regression to large datasets. In the proposed method, hyper-parameter learning is performed over small datasets and requires very little tuning effort. Methods currently used to apply Gaussian Process regression to large datasets typically have more hyper-parameters than the proposed method and can require significant tuning effort. The results of the experiments reported in this work are consistent with results from minimax theory for non-parametric regression. The key benefit of this algorithm is its simplicity of implementation.
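
The idea of averaging Gaussian Process estimators fitted on bootstrapped subsets can be illustrated with a minimal sketch. This is not the paper's implementation: the RBF kernel, the fixed hyper-parameters (`length_scale`, `noise`), the function names, and the defaults for `n_estimators` and `subset_size` are all assumptions chosen for illustration, and hyper-parameter learning is omitted entirely. Each estimator only ever factorizes a `subset_size` x `subset_size` matrix, which is what keeps the cost and memory bounded.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between the rows of a and b (assumed kernel choice)."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return variance * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)


def gp_predict(x_train, y_train, x_test, noise=0.1, length_scale=1.0):
    """Posterior mean of a zero-mean GP with fixed (not learned) hyper-parameters."""
    k = rbf_kernel(x_train, x_train, length_scale)
    k += noise**2 * np.eye(len(x_train))  # observation noise keeps k positive definite
    k_star = rbf_kernel(x_test, x_train, length_scale)
    chol = np.linalg.cholesky(k)
    # Solve k @ alpha = y_train through the Cholesky factor instead of inverting k.
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    return k_star @ alpha


def bagged_gp_predict(x_train, y_train, x_test, n_estimators=10,
                      subset_size=200, seed=0, **gp_kwargs):
    """Average the predictions of GPs fitted on small bootstrap samples."""
    rng = np.random.default_rng(seed)
    n = len(x_train)
    size = min(subset_size, n)
    preds = np.zeros((n_estimators, len(x_test)))
    for i in range(n_estimators):
        # Bootstrap sample: draw indices with replacement.
        idx = rng.choice(n, size=size, replace=True)
        preds[i] = gp_predict(x_train[idx], y_train[idx], x_test, **gp_kwargs)
    return preds.mean(axis=0)
```

Calling `bagged_gp_predict(x_train, y_train, x_test)` with 2-D input arrays and a 1-D target array returns one averaged prediction per test row. In this sketch the subset size and the number of estimators are the only knobs beyond the kernel settings; how they should be chosen empirically is the subject of the paper and is not reproduced here.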
