High-dimensional Covariance Estimation by Pairwise Likelihood Truncation

10 July 2024

Abstract

Pairwise likelihood is a useful approximation to the full likelihood function for covariance estimation in high-dimensional settings. It simplifies high-dimensional dependencies by combining marginal bivariate likelihood objects, making estimation more manageable. In certain models, including the Gaussian model, the pairwise and full likelihoods are maximized by the same parameter values, so the pairwise approach retains optimal statistical efficiency when the number of variables is fixed. Leveraging this insight, we introduce estimation of sparse high-dimensional covariance matrices by maximizing a truncated version of the pairwise likelihood function, obtained by including only the pairwise terms that correspond to nonzero covariance elements. To achieve a meaningful truncation, we propose to minimize the $L_2$-distance between the pairwise and full likelihood scores plus an $L_1$-penalty that discourages the inclusion of uninformative terms. Unlike other regularization approaches, our method selects whole pairwise likelihood objects rather than shrinking individual covariance parameters, thus retaining the inherent unbiasedness of the pairwise likelihood estimating equations. This selection procedure is shown to be selection consistent as the covariance dimension increases exponentially fast. Consequently, the implied pairwise likelihood estimator is consistent and converges to the oracle maximum likelihood estimator, which assumes knowledge of the nonzero covariance entries.
