
Partial Correlation Screening for Estimating Large Precision Matrices, with Applications to Classification

Abstract

We propose Partial Correlation Screening (PCS) as a new row-by-row approach to estimating a large precision matrix $\Omega$. To estimate the $i$-th row of $\Omega$, $1 \leq i \leq p$, PCS uses a Screen step and a Clean step. In the Screen step, PCS recruits a (small) subset of indices using a stage-wise algorithm, where in each stage, the algorithm updates the set of recruited indices by adding the index $j$ that has the largest (in magnitude) empirical partial correlation with $i$. In the Clean step, PCS re-investigates all recruited indices and uses them to reconstruct the $i$-th row of $\Omega$. PCS is computationally efficient and modest in memory use: to estimate a row of $\Omega$, it only needs a few rows (determined sequentially) of the empirical covariance matrix. This enables PCS to estimate a large precision matrix (e.g., $p = 10K$) in a few minutes, and opens doors to estimating much larger precision matrices. We use PCS for classification. Higher Criticism Thresholding (HCT) is a recent classifier that enjoys optimality, but to exploit its full potential in practice, one needs a good estimate of the precision matrix $\Omega$. Combining HCT with any approach to estimating $\Omega$ gives a new classifier: examples include HCT-PCS and HCT-glasso. We have applied HCT-PCS to two large microarray data sets ($p = 8K$ and $10K$) for classification, where it not only significantly outperforms HCT-glasso, but is also competitive with the Support Vector Machine (SVM) and Random Forest (RF). The results suggest that PCS gives more useful estimates of $\Omega$ than the glasso. We set up a general theoretical framework and show that in a broad context, PCS fully recovers the support of $\Omega$ and HCT-PCS yields optimal classification behavior. Our proofs shed interesting light on the behavior of stage-wise procedures.
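To make the Screen step concrete, below is a minimal, hypothetical Python sketch of the greedy recruitment described in the abstract. It is not the authors' implementation: the function name `screen_step`, the stopping parameters `max_steps` and `tol`, and the use of the full empirical covariance matrix are illustrative assumptions (the actual PCS only needs a few rows of the empirical covariance, determined sequentially).

```python
import numpy as np

def screen_step(Sigma_hat, i, max_steps=20, tol=0.05):
    """Hypothetical sketch of the Screen step of PCS for row i.

    Greedily recruits indices: at each stage, add the index j whose
    empirical partial correlation with i, given the recruited set S,
    is largest in magnitude. Stops when that magnitude drops below
    `tol` or after `max_steps` stages (both thresholds are assumed
    here for illustration, not taken from the paper).
    """
    p = Sigma_hat.shape[0]
    S = []  # recruited indices
    for _ in range(max_steps):
        best_j, best_rho = None, 0.0
        for j in range(p):
            if j == i or j in S:
                continue
            idx = [i, j] + S
            # invert the covariance submatrix over {i, j} ∪ S
            Theta = np.linalg.inv(Sigma_hat[np.ix_(idx, idx)])
            # empirical partial correlation of i and j given S
            rho = -Theta[0, 1] / np.sqrt(Theta[0, 0] * Theta[1, 1])
            if abs(rho) > abs(best_rho):
                best_j, best_rho = j, rho
        if best_j is None or abs(best_rho) < tol:
            break
        S.append(best_j)
    return S
```

Under these assumptions, the Clean step would then re-examine the recruited set `S` and use only those indices to reconstruct the $i$-th row of $\Omega$.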
