Scalable Thompson Sampling using Sparse Gaussian Process Models

9 June 2020

Abstract

Thompson Sampling (TS) with Gaussian Process (GP) models is a powerful tool for optimizing non-convex objective functions. Despite favorable theoretical properties, the computational complexity of the standard algorithms quickly becomes prohibitive as the number of observation points (i.e., the time horizon) grows. Scalable TS methods can be implemented using sparse GP models, but at the price of an approximation error that invalidates the existing regret bounds. Here, we prove regret bounds for TS based on approximate GP posteriors; applying them to sparse GPs shows that the improvement in computational complexity can be achieved with no loss in the order of regret performance. Specifically, when necessary conditions on some algorithmic parameters are satisfied, we show an $\tilde{O}(\gamma_T\sqrt{T})$ bound on the regret of TS using sparse GP models, where $\gamma_T$ is an upper bound on the information gain between the observations and the underlying model.
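To make the setting concrete, the following is a minimal sketch of TS with an inducing-point GP posterior. It is not the algorithm analyzed in the paper. It assumes a 1-D grid of candidate points, an RBF kernel, a fixed set of inducing inputs `z`, and the standard DTC-style predictive equations for sparse GPs. All function names and parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=0.2):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)

def sparse_posterior(x_obs, y_obs, x_grid, z, noise_var=1e-2, jitter=1e-6):
    """DTC-style inducing-point posterior mean and covariance on x_grid.

    Illustrative assumption: the cost is dominated by m x m solves and an
    m x n product (m inducing points, n observations), rather than the
    n x n solve required by an exact GP.
    """
    k_mm = rbf_kernel(z, z) + jitter * np.eye(len(z))
    k_mn = rbf_kernel(z, x_obs)
    k_sm = rbf_kernel(x_grid, z)
    # Sigma = (K_mm + K_mn K_nm / noise_var)^{-1}
    sigma = np.linalg.inv(k_mm + k_mn @ k_mn.T / noise_var)
    mean = k_sm @ sigma @ (k_mn @ y_obs) / noise_var
    cov = (rbf_kernel(x_grid, x_grid)
           - k_sm @ np.linalg.solve(k_mm, k_sm.T)
           + k_sm @ sigma @ k_sm.T)
    return mean, cov

def sample_gaussian(rng, mean, cov):
    """Draw one sample from N(mean, cov), clipping tiny negative eigenvalues."""
    w, v = np.linalg.eigh(0.5 * (cov + cov.T))
    w = np.clip(w, 0.0, None)
    return mean + v @ (np.sqrt(w) * rng.standard_normal(len(w)))

def thompson_sampling(objective, x_grid, z, horizon=30, noise_std=0.1, seed=0):
    """Run TS for `horizon` rounds; return the queried points and noisy values."""
    rng = np.random.default_rng(seed)
    x_obs = np.empty(0)
    y_obs = np.empty(0)
    for _ in range(horizon):
        # With no observations yet, the formulas reduce to the GP prior.
        mean, cov = sparse_posterior(x_obs, y_obs, x_grid, z,
                                     noise_var=noise_std ** 2)
        # TS step: sample a function from the posterior and maximize it.
        f_sample = sample_gaussian(rng, mean, cov)
        x_next = x_grid[np.argmax(f_sample)]
        y_next = objective(x_next) + noise_std * rng.standard_normal()
        x_obs = np.append(x_obs, x_next)
        y_obs = np.append(y_obs, y_next)
    return x_obs, y_obs

if __name__ == "__main__":
    grid = np.linspace(0.0, 1.0, 50)
    inducing = np.linspace(0.0, 1.0, 8)
    xs, ys = thompson_sampling(lambda x: np.sin(6.0 * x), grid, inducing, horizon=15)
    print("best observed value:", ys.max())
```

In this sketch the inducing inputs are fixed in advance and the posterior variance is used unscaled. The abstract states that the regret bound holds only when conditions on some algorithmic parameters are satisfied, and this toy example does not attempt to enforce them.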
