Equispaced Fourier representations for efficient Gaussian process regression from a billion data points

Abstract

We introduce a Fourier-based fast algorithm for Gaussian process regression in low dimensions. It approximates a translationally-invariant covariance kernel by complex exponentials on an equispaced Cartesian frequency grid of $M$ nodes. This results in a weight-space $M \times M$ system matrix with Toeplitz structure, which can thus be applied to a vector in ${\mathcal O}(M \log{M})$ operations via the fast Fourier transform (FFT), independent of the number of data points $N$. The linear system can be set up in ${\mathcal O}(N + M \log{M})$ operations using nonuniform FFTs. This enables efficient massive-scale regression via an iterative solver, even for kernels with fat-tailed spectral densities (large $M$). We provide bounds on both kernel approximation and posterior mean errors. Numerical experiments for squared-exponential and Matérn kernels in one, two and three dimensions often show 1-2 orders of magnitude acceleration over state-of-the-art rank-structured solvers at comparable accuracy. Our method allows 2D Matérn-$\frac{3}{2}$ regression from $N=10^9$ data points to be performed in 2 minutes on a standard desktop, with posterior mean accuracy $10^{-3}$. This opens up spatial statistics applications 100 times larger than previously possible.
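
The Toeplitz matrix-vector product mentioned above can be illustrated with a minimal 1D sketch. This is not the authors' implementation: the function name `toeplitz_matvec_fft`, the sizes `N` and `m`, and the frequency spacing `h` are hypothetical choices made for illustration, and the Toeplitz entries are formed here by dense sums costing ${\mathcal O}(NM)$ rather than by a nonuniform FFT. The sketch only checks that the Gram matrix of complex exponentials on an equispaced frequency grid can be applied through zero-padded FFTs without ever being formed.

```python
import numpy as np

def toeplitz_matvec_fft(t, x):
    """Apply the M x M Toeplitz matrix T[a, b] = t[(a - b) + M - 1] to x.

    t has length 2M - 1 and stores the entries for offsets -(M-1), ..., M-1.
    Cost is O(M log M) via zero-padded FFTs (a linear convolution).
    """
    M = x.shape[0]
    L = 3 * M - 2  # length of the full linear convolution, so no wraparound
    conv = np.fft.ifft(np.fft.fft(t, L) * np.fft.fft(x, L))
    # (T x)[a] = sum_b t[(a - b) + M - 1] x[b] = conv[a + M - 1]
    return conv[M - 1 : 2 * M - 1]

# Hypothetical small problem: N points in [0, 1], frequencies h*j for j = -m..m.
rng = np.random.default_rng(0)
N, m, h = 200, 8, 0.25
M = 2 * m + 1
x_pts = rng.uniform(0.0, 1.0, N)
freq_idx = np.arange(-m, m + 1)

# Dense N x M matrix of complex exponentials, built only to check the result.
F = np.exp(2j * np.pi * h * np.outer(x_pts, freq_idx))

# Toeplitz entries of F^* F: t_d = sum_n exp(-2*pi*i*h*d*x_n), d = -(M-1)..M-1.
# Dense sums are used here; a nonuniform FFT would replace this step at scale.
offsets = np.arange(-(M - 1), M)
t = np.exp(-2j * np.pi * h * np.outer(offsets, x_pts)).sum(axis=1)

v = rng.standard_normal(M) + 1j * rng.standard_normal(M)
y_fast = toeplitz_matvec_fft(t, v)   # O(M log M), independent of N
y_dense = F.conj().T @ (F @ v)       # O(N M) reference
print(np.allclose(y_fast, y_dense))
```

Once the $2M-1$ Toeplitz entries are stored, each application costs the same regardless of $N$, which is the property an iterative solver would exploit. Kernel weights and the noise term of the full weight-space system are omitted from this sketch.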
