In datasets where the number of parameters is fixed and the number of samples is large, principal component analysis (PCA) is a powerful dimension-reduction tool. However, in many contemporary datasets, where the number of parameters is comparable to the sample size, PCA can be misleading. A closely related problem is the following: is it possible to recover a rank-one matrix in the presence of a large amount of noise? In both situations, there is a phase transition in the eigen-structure of the matrix.
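
The rank-one recovery question can be illustrated numerically. The sketch below is not taken from the paper: it assumes one standard formulation, a symmetric "spiked" matrix Y = theta * v v^T + W, where v is a unit vector, W is symmetric Gaussian noise scaled so that its eigenvalues fill roughly [-2, 2], and theta is a signal-strength parameter. The function name `spiked_wigner_top_eig` and the parameters `theta`, `n`, and `seed` are hypothetical names chosen for this illustration.

```python
import numpy as np


def spiked_wigner_top_eig(theta, n=500, seed=0):
    """Return the top eigenvalue of Y = theta * v v^T + W and the squared
    overlap between the top eigenvector and v.

    Assumed model (not from the source): W is symmetric Gaussian noise with
    off-diagonal variance 1/n, so its bulk spectrum is roughly [-2, 2].
    """
    rng = np.random.default_rng(seed)

    # Unit-norm signal direction (illustrative choice: random direction).
    v = rng.standard_normal(n)
    v /= np.linalg.norm(v)

    # Symmetrize an i.i.d. Gaussian matrix and rescale by sqrt(n).
    g = rng.standard_normal((n, n))
    w = (g + g.T) / np.sqrt(2 * n)

    y = theta * np.outer(v, v) + w

    # eigh returns eigenvalues in ascending order; the last one is the largest.
    eigvals, eigvecs = np.linalg.eigh(y)
    top_val = float(eigvals[-1])
    overlap_sq = float(np.dot(eigvecs[:, -1], v) ** 2)
    return top_val, overlap_sq


if __name__ == "__main__":
    for theta in (0.5, 1.5, 3.0):
        top_val, overlap_sq = spiked_wigner_top_eig(theta)
        print(f"theta={theta}: top eigenvalue={top_val:.3f}, "
              f"squared overlap={overlap_sq:.3f}")
```

Under this assumed normalization, the known large-n behavior is that for theta below 1 the top eigenvalue stays near the bulk edge 2 and the top eigenvector carries essentially no information about v, while for theta above 1 the top eigenvalue separates to roughly theta + 1/theta and the squared overlap approaches 1 - 1/theta^2. Finite-n runs only approximate these limits.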