
Sample canonical correlation coefficients of high-dimensional random vectors: local law and Tracy-Widom limit

Abstract

Consider two random vectors $\mathbf C_1^{1/2}\mathbf x \in \mathbb R^p$ and $\mathbf C_2^{1/2}\mathbf y\in \mathbb R^q$, where the entries of $\mathbf x$ and $\mathbf y$ are i.i.d. random variables with mean zero and variance one, and $\mathbf C_1$ and $\mathbf C_2$ are $p \times p$ and $q\times q$ deterministic population covariance matrices. With $n$ independent samples of $(\mathbf C_1^{1/2}\mathbf x,\mathbf C_2^{1/2}\mathbf y)$, we study the sample correlation between these two vectors using canonical correlation analysis. We denote by $S_{xx}$ and $S_{yy}$ the sample covariance matrices for $\mathbf C_1^{1/2}\mathbf x$ and $\mathbf C_2^{1/2}\mathbf y$, respectively, and by $S_{xy}$ the sample cross-covariance matrix. Then the sample canonical correlation coefficients are the square roots of the eigenvalues of the sample canonical correlation matrix $\mathcal C_{XY}:=S_{xx}^{-1}S_{xy}S_{yy}^{-1}S_{yx}$. Under the high-dimensional setting with $p/n\to c_1 \in (0, 1)$ and $q/n\to c_2 \in (0, 1-c_1)$ as $n\to \infty$, we prove that the largest eigenvalue of $\mathcal C_{XY}$ converges to the Tracy-Widom distribution as long as we have $\lim_{s \rightarrow \infty}s^4 [\mathbb{P}(\vert x_{ij} \vert \geq s)+ \mathbb{P}(\vert y_{ij} \vert \geq s)]=0$. This extends the result in [16], which established the Tracy-Widom limit of the largest eigenvalue of $\mathcal C_{XY}$ under the assumption that all moments are finite. Our proof is based on a linearization method, which reduces the problem to the study of a $(p+q+2n)\times (p+q+2n)$ random matrix $H$. In particular, we shall prove an optimal local law on its inverse $G:=H^{-1}$, i.e., the resolvent.
This local law is the main tool for both the proof of the Tracy-Widom law in this paper, and the study in [22,23] on the canonical correlation coefficients of high-dimensional random vectors with finite rank correlations.
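
As a purely illustrative aid, the definition of the sample canonical correlation coefficients above can be sketched numerically. The following is a minimal NumPy sketch, not the paper's method: the function name `sample_canonical_correlations`, the $1/n$ normalization of the sample covariance matrices, the absence of centering (the entries are assumed to have mean zero), and the Gaussian test data are all assumptions made for this example.

```python
import numpy as np


def sample_canonical_correlations(X, Y):
    """Sample canonical correlation coefficients of two data matrices.

    X is p x n and Y is q x n, with one sample per column.
    Hypothetical helper: the 1/n normalization and the lack of centering
    are assumptions of this sketch (entries are taken to have mean zero).
    """
    n = X.shape[1]
    Sxx = X @ X.T / n  # sample covariance of the first vector (p x p)
    Syy = Y @ Y.T / n  # sample covariance of the second vector (q x q)
    Sxy = X @ Y.T / n  # sample cross-covariance (p x q); Syx is its transpose

    # C_XY = Sxx^{-1} Sxy Syy^{-1} Syx, built with linear solves
    # instead of explicit matrix inverses.
    left = np.linalg.solve(Sxx, Sxy)     # Sxx^{-1} Sxy, shape p x q
    right = np.linalg.solve(Syy, Sxy.T)  # Syy^{-1} Syx, shape q x p
    C = left @ right                     # p x p

    # The eigenvalues are real and lie in [0, 1] in exact arithmetic;
    # take real parts and clip to remove floating-point noise.
    eigenvalues = np.clip(np.linalg.eigvals(C).real, 0.0, 1.0)
    return np.sqrt(np.sort(eigenvalues)[::-1])


# Illustrative null-case data: independent standard Gaussian entries.
rng = np.random.default_rng(0)
p, q, n = 20, 30, 400
X = rng.standard_normal((p, n))
Y = rng.standard_normal((q, n))
coefficients = sample_canonical_correlations(X, Y)
```

In this sketch, replacing `X` by `A @ X` for any invertible $p \times p$ matrix `A` leaves the coefficients unchanged, since the eigenvalues of $\mathcal C_{XY}$ are invariant under invertible linear transformations of either vector.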
