Gaussian Approximation and Multiplier Bootstrap for Polyak-Ruppert Averaged Linear Stochastic Approximation with Applications to TD Learning

26 May 2024

Abstract

In this paper, we obtain the Berry-Esseen bound for multivariate normal approximation for the Polyak-Ruppert averaged iterates of the linear stochastic approximation (LSA) algorithm with decreasing step size. Moreover, we prove the non-asymptotic validity of the confidence intervals for parameter estimation with LSA based on multiplier bootstrap. This procedure updates the LSA estimate together with a set of randomly perturbed LSA estimates upon the arrival of subsequent observations. We illustrate our findings in the setting of temporal difference learning with linear function approximation.

View on arXiv

@article{samsonov2025_2405.16644,
  title={ Gaussian Approximation and Multiplier Bootstrap for Polyak-Ruppert Averaged Linear Stochastic Approximation with Applications to TD Learning },
  author={ Sergey Samsonov and Eric Moulines and Qi-Man Shao and Zhuo-Song Zhang and Alexey Naumov },
  journal={arXiv preprint arXiv:2405.16644},
  year={ 2025 }
}

Comments on this paper