CMStatistics 2023
B1437
Title: On U-estimation of principal components when $n < p$
Authors: Nuwan Weeraratne - University of Waikato (New Zealand) [presenting]
Lynette Hunt - University of Waikato (New Zealand)
Jason Kurz - University of Waikato (New Zealand)
Abstract: Principal components analysis (PCA) is a workhorse dimensionality-reduction technique, widely used in practice to ensure model identifiability when the sample size, $n$, is exceeded by the data dimensionality, $p$. This is accomplished by transforming the original variables into a new set of uncorrelated variables (principal components, PCs). The majority of the variation present in the original variables is retained in the first few PCs, so a small number of PCs can express most of the variation in the data set. However, because the conventional covariance estimator does not converge to the true covariance matrix, standard PCA performs poorly as a dimensionality-reduction technique in $n < p$ large-dimensional scenarios. Inspired by a fundamental issue associated with mean estimation when $n < p$, the advantages of generalizing a well-known U-estimator of the univariate variance to covariance matrix estimation are examined. In simulation experiments, (typically small but) persistent improvements in the estimation of principal components are demonstrated against known ground truth, measured by the angular separation between the population and sample PCs.
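To illustrate the two ingredients mentioned in the abstract, the sketch below shows (a) the standard pairwise U-statistic form of a covariance estimator, using the classical kernel $h(x_i, x_j) = (x_i - x_j)(x_i - x_j)^\top / 2$ that generalizes the U-estimator of the univariate variance, and (b) the angular separation between a population PC and a sample PC. This is a minimal, hedged sketch for intuition only; the authors' actual estimator and simulation design are not specified in the abstract, and all function names here are hypothetical.

```python
import numpy as np

def u_covariance(X):
    """Pairwise U-statistic covariance estimator (illustrative, not the
    authors' estimator): average of the kernel
    h(x_i, x_j) = (x_i - x_j)(x_i - x_j)^T / 2 over all unordered pairs."""
    n, p = X.shape
    S = np.zeros((p, p))
    for i in range(n):
        for j in range(i + 1, n):
            d = X[i] - X[j]
            S += np.outer(d, d)
    # Sum over n(n-1)/2 pairs, each kernel carries a factor 1/2,
    # so the normalization collapses to 1 / (n(n-1)).
    return S / (n * (n - 1))

def angular_separation_deg(u, v):
    """Angle in degrees between a population PC u and a sample PC v,
    ignoring sign (PCs are defined up to sign)."""
    c = abs(u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.degrees(np.arccos(np.clip(c, -1.0, 1.0)))
```

With this particular kernel the U-estimator coincides with the usual unbiased sample covariance; the point of the sketch is only to show the pairwise-kernel form and the angle metric used to compare sample PCs against a known population PC in simulations.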