Quasi-Newton Quasi-Monte Carlo for variational Bayes

Many machine learning problems optimize an objective that must be measured with noise. The primary method is first-order stochastic gradient descent using one or more Monte Carlo (MC) samples at each step. There are settings where ill-conditioning makes second-order methods such as L-BFGS more effective. We study the use of randomized quasi-Monte Carlo (RQMC) sampling for such problems. When MC sampling has a root mean squared error (RMSE) of $O(n^{-1/2})$, RQMC has an RMSE of $o(n^{-1/2})$ that can be close to $O(n^{-3/2})$ in favorable settings. We prove that improved sampling accuracy translates directly to improved optimization. In our empirical investigations for variational Bayes, using RQMC with stochastic L-BFGS greatly speeds up the optimization, and sometimes finds a better parameter value than MC does.
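The gap in sampling accuracy between MC and RQMC can be checked numerically on a toy problem. The sketch below is not the paper's experiment: the integrand, the dimension, the sample size, and the helper names `integrand`, `rmse`, and `compare` are all assumptions chosen for illustration. It estimates a Gaussian expectation with plain MC and with scrambled Sobol' points from `scipy.stats.qmc`, then compares the empirical RMSE over independent replications.

```python
import numpy as np
from scipy.stats import norm, qmc


def integrand(z):
    # Hypothetical smooth test function of a standard normal vector z, shape (n, d).
    # For z ~ N(0, I_d), sum(z)/sqrt(d) ~ N(0, 1), so E[cos(.)] = exp(-1/2).
    return np.cos(z.sum(axis=1) / np.sqrt(z.shape[1]))


def rmse(estimates, truth):
    # Root mean squared error of replicated estimates around the known truth.
    return float(np.sqrt(np.mean((np.asarray(estimates) - truth) ** 2)))


def compare(m=10, d=4, reps=50, seed=0):
    # Compare MC and RQMC estimates of E[integrand(Z)] using n = 2**m points each.
    rng = np.random.default_rng(seed)
    truth = np.exp(-0.5)
    n = 2 ** m
    mc_estimates, rqmc_estimates = [], []
    for _ in range(reps):
        # Plain Monte Carlo: n independent standard normal draws.
        z_mc = rng.standard_normal((n, d))
        mc_estimates.append(integrand(z_mc).mean())

        # RQMC: a freshly scrambled Sobol' sequence, mapped to normals
        # through the inverse Gaussian CDF.
        sobol = qmc.Sobol(d=d, scramble=True, seed=int(rng.integers(2**31)))
        u = sobol.random_base2(m=m)
        u = np.clip(u, 1e-12, 1.0 - 1e-12)  # guard against an exact 0
        z_rqmc = norm.ppf(u)
        rqmc_estimates.append(integrand(z_rqmc).mean())
    return rmse(mc_estimates, truth), rmse(rqmc_estimates, truth)


if __name__ == "__main__":
    rmse_mc, rmse_rqmc = compare()
    print(f"MC RMSE:   {rmse_mc:.2e}")
    print(f"RQMC RMSE: {rmse_rqmc:.2e}")
```

Each replication draws a new scramble, so the RQMC estimates are independent and unbiased and their RMSE can be estimated the same way as for MC. Increasing `m` should show the RQMC error shrinking faster than the MC error for a smooth integrand like this one.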