Abstract
Principal component analysis (PCA) is a ubiquitous method of multivariate statistics that focuses on the eigenvalues and eigenvectors of the sample covariance matrix of a data set. We consider p data vectors of dimension N drawn from a distribution with a given covariance matrix. We use the replica method to evaluate the expected eigenvalue distribution as N → ∞ with p = αN for some fixed α. In contrast to existing studies, we consider the case where the covariance matrix contains a number of symmetry-breaking directions, so that the sample data set contains some definite structure. Explicitly we set […] with […]. We find that the bulk of the eigenvalues is distributed as in the case where the elements of the data vectors are independent and identically distributed. With increasing α a series of phase transitions is observed at […]; at each transition a single δ function, […], separates from the upper edge of the bulk distribution, where […]. We confirm the results of the replica analysis by studying the Stieltjes transform of the eigenvalue distribution. This suggests that the results obtained from the replica analysis are universal, irrespective of the distribution from which the data vectors are drawn, provided the fourth moment of each element of the data vectors exists.
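The separation of an isolated eigenvalue from the bulk can be illustrated numerically. The following is only a minimal sketch, not the replica calculation itself. The explicit covariance matrix and transition formulas are not legible in the text above, so the sketch assumes the standard single-spike model, covariance sigma^2 (I + A b b^T) with a unit vector b, together with the well-known large-N predictions for that model: a bulk upper edge at sigma^2 (1 + alpha^(-1/2))^2 and, for alpha > A^(-2), an outlier near sigma^2 (1 + A)(1 + 1/(alpha A)). The function name, the parameter values and the Gaussian data are likewise illustrative assumptions.

```python
import numpy as np


def spiked_pca_spectrum(N=400, p=1600, A=1.5, sigma=1.0, seed=0):
    """Sample-covariance spectrum for data with one symmetry-breaking direction.

    Assumed model (not taken from the abstract): population covariance
    sigma^2 * (I + A * b b^T) with a random unit vector b.
    """
    rng = np.random.default_rng(seed)

    # Random unit vector playing the role of the symmetry-breaking direction.
    b = rng.standard_normal(N)
    b /= np.linalg.norm(b)

    # Isotropic Gaussian data, then stretch the component along b by
    # sqrt(1 + A) so that the covariance is sigma^2 * (I + A * b b^T).
    Z = rng.standard_normal((p, N))
    X = sigma * (Z + (np.sqrt(1.0 + A) - 1.0) * np.outer(Z @ b, b))

    # Sample covariance (data are zero mean by construction) and its
    # eigenvalues in ascending order.
    C_hat = X.T @ X / p
    eigvals = np.linalg.eigvalsh(C_hat)

    alpha = p / N
    # Assumed large-N predictions for this single-spike model.
    bulk_edge = sigma**2 * (1.0 + alpha**-0.5) ** 2
    if alpha > A**-2:
        outlier = sigma**2 * (1.0 + A) * (1.0 + 1.0 / (alpha * A))
    else:
        outlier = None  # below the transition no eigenvalue separates

    return eigvals, bulk_edge, outlier


if __name__ == "__main__":
    eigvals, bulk_edge, outlier = spiked_pca_spectrum()
    print("largest sample eigenvalue:  ", eigvals[-1])
    print("second largest eigenvalue:  ", eigvals[-2])
    print("predicted bulk upper edge:  ", bulk_edge)
    print("predicted outlier location: ", outlier)
```

With the default values (alpha = 4, A = 1.5) the model is above the assumed transition, so the largest sample eigenvalue should sit clearly above the bulk edge while the second largest stays close to it. Choosing p small enough that alpha falls below A^(-2) should leave no separated eigenvalue.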
Received 2 April 2003
DOI: https://doi.org/10.1103/PhysRevE.69.026124
©2004 American Physical Society