Independence test for high dimensional data based on regularized canonical correlation coefficients

Abstract

This paper proposes a new statistic to test independence between two high dimensional random vectors $\mathbf{X}: p_1 \times 1$ and $\mathbf{Y}: p_2 \times 1$. The proposed statistic is based on the sum of regularized sample canonical correlation coefficients of $\mathbf{X}$ and $\mathbf{Y}$. The asymptotic distribution of the statistic under the null hypothesis is established as a corollary of general central limit theorems (CLT) for the linear statistics of classical and regularized sample canonical correlation coefficients when $p_1$ and $p_2$ are both comparable to the sample size $n$. As applications of the developed independence test, various types of dependent structures, such as factor models, ARCH models and a general uncorrelated but dependent case, etc., are investigated by simulations. As an empirical application, cross-sectional dependence of daily stock returns of companies between different sections in the New York Stock Exchange (NYSE) is detected by the proposed test.
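
To make the idea of a statistic built from regularized sample canonical correlations concrete, the following is a minimal sketch and not the paper's procedure. It rests on two assumptions that the abstract does not confirm: the regularization is taken to be a ridge term $t\mathbf{I}$ added to both sample covariance matrices, and the null distribution is approximated by permuting the rows of $\mathbf{Y}$ instead of using the CLT established in the paper. The function names `regularized_cc_sum` and `permutation_pvalue` and the parameter `t` are hypothetical.

```python
import numpy as np


def regularized_cc_sum(X, Y, t=0.1):
    """Sum of squared ridge-regularized sample canonical correlations.

    X has shape (n, p1) and Y has shape (n, p2). The ridge form used here
    (adding t * I to both sample covariances) is an assumption made for
    illustration, not the paper's definition.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Sxx = Xc.T @ Xc / n
    Syy = Yc.T @ Yc / n
    Sxy = Xc.T @ Yc / n
    p1, p2 = Sxx.shape[0], Syy.shape[0]
    # trace((Sxx + tI)^{-1} Sxy (Syy + tI)^{-1} Syx) equals the sum of the
    # eigenvalues of that matrix, i.e. the squared regularized correlations.
    A = np.linalg.solve(Sxx + t * np.eye(p1), Sxy)
    B = np.linalg.solve(Syy + t * np.eye(p2), Sxy.T)
    return float(np.trace(A @ B))


def permutation_pvalue(X, Y, t=0.1, n_perm=99, seed=0):
    """Permutation p-value: a stand-in for the paper's CLT-based null law."""
    rng = np.random.default_rng(seed)
    observed = regularized_cc_sum(X, Y, t)
    null = np.array([
        regularized_cc_sum(X, Y[rng.permutation(Y.shape[0])], t)
        for _ in range(n_perm)
    ])
    # Add-one correction keeps the p-value strictly positive.
    return (1 + np.sum(null >= observed)) / (1 + n_perm)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.standard_normal((80, 10))
    Y_dep = X[:, :8] + 0.1 * rng.standard_normal((80, 8))
    print("statistic:", regularized_cc_sum(X, Y_dep))
    print("p-value:", permutation_pvalue(X, Y_dep))
```

With $t > 0$ the linear solves remain well defined even when $p_1$ or $p_2$ exceeds $n$ and the sample covariance matrices are singular, which is the regime in which the classical coefficients break down.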
