On the inference about the spectra of high-dimensional covariance matrix based on noisy observations-with applications to integrated covolatility matrix inference in the presence of microstructure noise

In practice, observations are often contaminated by noise, so the resulting sample covariance matrix is an information-plus-noise-type covariance matrix. Aiming to make inferences about the spectra of the underlying true covariance matrix in this setting, we establish an asymptotic relationship that describes how the limiting spectral distribution of (true) sample covariance matrices depends on that of information-plus-noise-type sample covariance matrices. As an application, we consider inference about the spectra of integrated covolatility (ICV) matrices of high-dimensional diffusion processes based on high-frequency data with microstructure noise. The (slightly modified) pre-averaging estimator is an information-plus-noise-type covariance matrix, and the aforementioned result, together with a (generalized) connection between the spectral distribution of true sample covariance matrices and that of the population covariance matrix, enables us to propose a two-step procedure to estimate the spectral distribution of the ICV for a class of diffusion processes. An alternative estimator is further proposed, which possesses two desirable properties: it eliminates the impact of microstructure noise, and its limiting spectral distribution depends only on that of the ICV through the standard Marčenko-Pastur equation. Numerical studies demonstrate that our proposed methods can be used to estimate the spectra of the underlying covariance matrix based on noisy observations.
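
To make the information-plus-noise setting concrete, the following minimal sketch compares the eigenvalues of a sample covariance matrix built from clean data with those of one built from noise-contaminated data. It is an illustration only, not the estimator proposed in the paper: the two-point population spectrum, the dimensions `p` and `n`, the noise level `noise_sd`, and the helper `sample_covariance_spectrum` are all assumptions chosen for the example.

```python
import numpy as np

def sample_covariance_spectrum(data):
    """Ascending eigenvalues of (1/n) * X X^T for a p x n data matrix X."""
    p, n = data.shape
    return np.linalg.eigvalsh(data @ data.T / n)

rng = np.random.default_rng(0)
p, n = 100, 400  # hypothetical dimension and sample size (p/n = 0.25)

# Hypothetical population spectrum: half the eigenvalues are 1, half are 3.
pop_eigs = np.where(np.arange(p) < p // 2, 1.0, 3.0)

# Clean observations with diagonal population covariance diag(pop_eigs).
signal = np.sqrt(pop_eigs)[:, None] * rng.standard_normal((p, n))

# Additive i.i.d. noise, giving information-plus-noise observations.
noise_sd = 0.5
noise = noise_sd * rng.standard_normal((p, n))

clean_eigs = sample_covariance_spectrum(signal)          # "true" sample covariance
noisy_eigs = sample_covariance_spectrum(signal + noise)  # information-plus-noise type

print("mean eigenvalue (clean):", clean_eigs.mean())
print("mean eigenvalue (noisy):", noisy_eigs.mean())
```

In this toy setup the noise shifts the average eigenvalue upward by roughly `noise_sd**2`, which shows why the spectrum of the noisy matrix cannot be read directly as the spectrum of the true sample covariance matrix.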