B1027
Title: Variable selection in binary data with few events and possible separation
Authors: Emmanuel Ogundimu - University of Durham (United Kingdom) [presenting]
Abstract: Lasso-type methods are commonly used for variable selection in binary data because of their shrinkage property and prediction accuracy. However, they can be inconsistent in selecting variables when the outcome of interest is rare or when the true underlying model has a sparse representation. The problem is exacerbated in the presence of separation, where one or more model covariates perfectly predict the outcome. A possible solution is to combine the Firth penalized likelihood method with lasso-type methods. The Firth method produces finite parameter estimates even in the presence of separation, while the regularized methods promote sparsity. The Firth penalized likelihood approach effectively reduces bias in regression coefficients when events are rare, and it can also be tuned to achieve a good trade-off between separation and stability. The tuned version serves as an intermediate between the Firth penalty and no penalty. Consequently, a two-stage method for variable selection is proposed. In the first stage, the Firth-type method is tuned; in the second stage, a lasso-type method is applied to the tuned estimator. Extensive simulation studies are conducted to examine the performance of the proposed procedures in finite samples. Extensions to data with grouped covariates are discussed.
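As an illustration of the first stage only, the following is a minimal sketch of a tuned Firth-type logistic fit. It rests on assumptions that the abstract does not state: the penalty is written as tau * log|I(beta)| added to the log-likelihood, so that tau = 0.5 recovers the standard Firth penalty and tau = 0 gives ordinary maximum likelihood. The name `tuned_firth_logistic`, the parameter `tau`, the step cap, and the toy data are all hypothetical. How tau is chosen and how the second-stage lasso-type method is applied are not specified in the abstract, so neither is shown.

```python
import numpy as np

def tuned_firth_logistic(X, y, tau=0.5, max_iter=200, tol=1e-8, max_step=5.0):
    """Logistic regression with penalty tau * log|I(beta)| (assumed form).

    tau = 0.5 is the usual Firth penalty; tau = 0 is plain maximum likelihood.
    Uses Fisher-scoring steps on the modified score.
    """
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(max_iter):
        pi = 1.0 / (1.0 + np.exp(-(X @ beta)))      # fitted probabilities
        w = pi * (1.0 - pi)                         # logistic variance weights
        info = X.T @ (X * w[:, None])               # Fisher information X'WX
        info_inv = np.linalg.inv(info)
        # Diagonal of the weighted hat matrix: h_i = w_i * x_i' (X'WX)^{-1} x_i
        h = w * np.einsum("ij,jk,ik->i", X, info_inv, X)
        # Modified score: the ML score plus the gradient of tau * log|I(beta)|
        score = X.T @ (y - pi + tau * h * (1.0 - 2.0 * pi))
        step = info_inv @ score
        biggest = np.max(np.abs(step))
        if biggest > max_step:                      # cap very large steps
            step *= max_step / biggest
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Toy data with complete separation: y = 1 exactly when x > 0,
# so the unpenalized maximum likelihood slope does not exist (it diverges).
x = np.array([-2.0, -1.0, -0.5, 0.5, 1.0, 2.0])
y = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
X = np.column_stack([np.ones_like(x), x])           # intercept + one covariate

b_firth = tuned_firth_logistic(X, y, tau=0.5)       # standard Firth penalty
b_mid = tuned_firth_logistic(X, y, tau=0.1)         # lighter, "tuned" penalty
print("Firth (tau=0.5):", b_firth)
print("Tuned (tau=0.1):", b_mid)
```

On this toy data both fits return finite coefficients despite the separation, and the lighter penalty shrinks the slope less than the full Firth penalty does, which is the sense in which a tuned version sits between Firth and no penalty.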