A0210
Title: Selection of relevant pairwise logratios for high-dimensional compositional data
Authors: Paulina Jaskova - Palacky University Olomouc (Czech Republic) [presenting]
Karel Hron - Palacky University (Czech Republic)
Javier Palarea-Albaladejo - University of Girona (Spain)
Matthias Templ - University of Applied Sciences and Arts Northwestern Switzerland (Switzerland)
Abstract: Identifying biomarkers is one of the most important steps in microbiome data analysis. Microbiome data are high-dimensional compositional data, that is, relative data in which the relevant information is contained in logratios between variables. Biomarkers can usually be represented by pairwise logratios, which carry the key information. However, because of the high dimensionality of the data, analysing all possible pairwise logratios is statistically challenging, since each $D$-part composition yields $D(D-1)/2$ pairwise logratios. The main goal of this contribution is to present an algorithm that addresses the large number of orthonormal logratio coordinate representations needed to represent the individual logratios. The algorithm is based on Latin square theory and creates $D-1$ coordinate systems (balances) containing all logratios, which can subsequently be used in partial least squares regression to identify significant logratios. The properties of this new approach will be investigated using real high-dimensional compositions.
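The counting argument in the abstract can be illustrated with a small sketch. It is not the authors' algorithm, which the abstract does not spell out. It assumes an even number of parts $D$ and uses the standard round-robin (circle) construction, which is closely related to Latin squares, to split all $D(D-1)/2$ pairs of parts into $D-1$ groups of $D/2$ disjoint pairs. Each group could then seed one coordinate system in which every pair enters as a normalized two-part balance. The function names `n_pairwise_logratios`, `round_robin_pairings` and `pairwise_balance` are hypothetical and chosen only for this illustration.

```python
import math
from itertools import combinations


def n_pairwise_logratios(D):
    """Number of distinct pairwise logratios in a D-part composition."""
    return D * (D - 1) // 2


def round_robin_pairings(D):
    """Split all pairs of D parts (D even) into D-1 rounds of D/2 disjoint pairs.

    Uses the circle method: part 0 stays fixed, the others rotate.
    This is an illustrative assumption, not the method of the abstract.
    """
    if D < 2 or D % 2 != 0:
        raise ValueError("this sketch assumes an even number of parts D >= 2")
    parts = list(range(D))
    rounds = []
    for _ in range(D - 1):
        # Pair the i-th part from the left with the i-th part from the right.
        pairs = [tuple(sorted((parts[i], parts[D - 1 - i]))) for i in range(D // 2)]
        rounds.append(pairs)
        # Keep the first part fixed and rotate the remaining parts by one.
        parts = [parts[0], parts[-1]] + parts[1:-1]
    return rounds


def pairwise_balance(x, i, j):
    """Normalized balance between two single parts: ln(x_i / x_j) / sqrt(2)."""
    return math.log(x[i] / x[j]) / math.sqrt(2)


if __name__ == "__main__":
    D = 6
    rounds = round_robin_pairings(D)
    print(n_pairwise_logratios(D), "pairwise logratios in", len(rounds), "rounds")
    for r, pairs in enumerate(rounds, start=1):
        print("round", r, pairs)
```

Under these assumptions, every pair of parts appears in exactly one round, so the $D-1$ rounds jointly cover all pairwise logratios while each round uses every part only once.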