View Submission - COMPSTAT2024
A0262
Title: Comparison of the LASSO and IPF-LASSO methods for multi-omics data: Variable selection with Type I error control
Authors: Charlotte Castel - Oslo University Hospital (Norway) [presenting]
Abstract: Variable selection in high-dimensional regression with omics data is difficult, and robust, dependable methods are essential. The IPF-LASSO model has advanced the field by integrating diverse omics modalities through a distinct penalty parameter for each modality. However, controlling false positives when combining these heterogeneous data layers remains an unresolved challenge. To address this problem, we used stability selection, which performs variable selection with error control. We applied stability selection to both the LASSO and the IPF-LASSO to evaluate whether the modality-specific penalties of the IPF-LASSO increase statistical power while maintaining error control. Analyses were conducted on two high-dimensional datasets, one with independent and one with correlated variables. Simulation studies indicated that both methods controlled false positives and that the IPF-LASSO increased power, especially when the relevance of variables differed markedly across modalities. The models were illustrated using data from a study on breast cancer treatment, where the IPF-LASSO selected some highly relevant clinical variables. To our knowledge, this is the first study to integrate multiple correlated omics data modalities into a regression framework while controlling false positives.
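As a rough illustration of the comparison described in the abstract, the sketch below runs stability selection on top of a LASSO with per-variable penalty weights. It is not the authors' implementation. It assumes the common formulation in which variable j is penalized by lam * weights[j], so that equal weights give the ordinary LASSO and modality-specific weights give an IPF-LASSO-style fit. The simulated data, the two modalities, the penalty factors (1 and 2), the value of lam, the number of subsamples, and the selection threshold of 0.7 are all hypothetical choices made for the example.

```python
import random

def weighted_lasso(X, y, lam, weights, n_iter=50):
    """Coordinate descent for (1/2n)*||y - X b||^2 + lam * sum_j weights[j]*|b_j|.
    No intercept is fitted; the toy data below are simulated with mean zero."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta, starting from beta = 0
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of column j with the partial residual (beta_j removed)
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            # Soft-thresholding with the variable-specific penalty
            new = max(abs(rho) - lam * weights[j], 0.0) / col_sq[j]
            if rho < 0:
                new = -new
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                beta[j] = new
    return beta

def stability_selection(X, y, lam, weights, n_subsamples=40, threshold=0.7, seed=0):
    """Refit on random half-samples and keep variables selected often enough."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(n_subsamples):
        idx = rng.sample(range(n), n // 2)
        beta = weighted_lasso([X[i] for i in idx], [y[i] for i in idx], lam, weights)
        for j in range(p):
            if beta[j] != 0.0:
                counts[j] += 1
    freq = [c / n_subsamples for c in counts]
    selected = [j for j in range(p) if freq[j] >= threshold]
    return selected, freq

# Hypothetical toy data: modality 1 = variables 0-4, modality 2 = variables 5-24.
# Only variables 0 and 1 (both in modality 1) affect the response.
rng = random.Random(1)
n, p1, p2 = 80, 5, 20
X = [[rng.gauss(0, 1) for _ in range(p1 + p2)] for _ in range(n)]
y = [2.0 * row[0] + 2.0 * row[1] + rng.gauss(0, 1) for row in X]

lam = 0.3
lasso_weights = [1.0] * (p1 + p2)        # one common penalty: ordinary LASSO
ipf_weights = [1.0] * p1 + [2.0] * p2    # assumed modality-specific penalty factors

lasso_sel, lasso_freq = stability_selection(X, y, lam, lasso_weights)
ipf_sel, ipf_freq = stability_selection(X, y, lam, ipf_weights)
print("LASSO stable set:    ", lasso_sel)
print("IPF-LASSO stable set:", ipf_sel)
```

In practice the penalty factors and the selection threshold would be tuned or derived from the desired error bound rather than fixed by hand as they are here.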