View Submission - COMPSTAT2023
A0213
Title: Identification of important pairwise logratios in compositional data employing sparse principal component analysis
Authors: Viktorie Nesrstova - Palacky University, Olomouc (Czech Republic) [presenting]
Ines Wilms - Maastricht University (Netherlands)
Karel Hron - Palacky University Olomouc (Czech Republic)
Peter Filzmoser - TU Wien (Austria)
Abstract: Compositional data carry relative information: their elemental information is contained in the pairwise logratios of the parts that constitute the composition. While pairwise logratios are typically easy to interpret, the number of possible pairs grows quickly (a composition with D parts has D(D-1)/2 of them), leading to a potentially exhausting analysis even for medium-sized compositions. Sparse principal component analysis (PCA) is therefore an appealing tool for identifying important pairwise logratios and, in turn, the important parts of the composition. To this end, we apply sparse PCA to the possibly high-dimensional matrix of all pairwise logratios. The L1 penalty in the optimization problem controls the trade-off between explained variability and sparsity in the loadings of the pairwise logratios. The procedure is demonstrated on both simulated and empirical (geochemical) data sets. To aid practitioners in the discovery of important pairwise logratios, we introduce three practical visualization tools that (i) show the balance between explained variability and sparsity of the model, (ii) show the stability of pairwise logratios, and (iii) highlight the importance of each particular part in the composition.
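The idea of applying an L1-penalized PCA to the matrix of all pairwise logratios can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`pairwise_logratios`, `soft_threshold`, `sparse_pc1`), the part names, the simulated data, and the penalty value `lam` are all assumptions made for illustration, and the sparse loading vector is obtained with a generic rank-one alternating scheme with soft-thresholding, which may differ from the sparse PCA formulation used in the talk.

```python
from itertools import combinations

import numpy as np


def pairwise_logratios(X, part_names):
    """Build the n x D(D-1)/2 matrix of ln(x_i / x_j) for all pairs i < j."""
    logX = np.log(np.asarray(X, dtype=float))
    pairs = list(combinations(range(logX.shape[1]), 2))
    Z = np.column_stack([logX[:, i] - logX[:, j] for i, j in pairs])
    labels = [f"ln({part_names[i]}/{part_names[j]})" for i, j in pairs]
    return Z, labels


def soft_threshold(a, lam):
    """Elementwise soft-thresholding, the proximal step of the L1 penalty."""
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)


def sparse_pc1(Z, lam, n_iter=200):
    """First sparse loading vector (illustrative rank-one scheme, an assumption).

    Alternates between a unit-norm score vector u and loadings v, where v is
    soft-thresholded by lam. With lam = 0 this reduces to ordinary PCA.
    """
    Zc = Z - Z.mean(axis=0)                      # column-centre the logratios
    v = np.linalg.svd(Zc, full_matrices=False)[2][0]  # start at the ordinary PC1
    for _ in range(n_iter):
        u = Zc @ v
        norm_u = np.linalg.norm(u)
        if norm_u == 0:                          # every loading was shrunk to zero
            break
        u = u / norm_u
        v = soft_threshold(Zc.T @ u, lam)
    norm_v = np.linalg.norm(v)
    return v / norm_v if norm_v > 0 else v


# Hypothetical 5-part composition with simulated (not real) data.
rng = np.random.default_rng(0)
parts = ["A", "B", "C", "D", "E"]
X = rng.lognormal(size=(50, len(parts)))

Z, labels = pairwise_logratios(X, parts)         # 5 parts give 10 logratios
loadings = sparse_pc1(Z, lam=2.0)                # lam chosen arbitrarily
selected = [lab for lab, w in zip(labels, loadings) if w != 0]
print(f"{len(selected)} of {len(labels)} logratios have nonzero loadings")
```

Increasing `lam` shrinks more loadings to exactly zero at the cost of explained variability, which is the trade-off the abstract describes; in practice the penalty would be tuned rather than fixed by hand as it is here.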