A0221
Title: Computational strategies for regression model selection in the high-dimensional case
Authors: Marios Demosthenous - Cyprus University of Technology (Cyprus) [presenting]
Cristian Gatu - Alexandru Ioan Cuza University of Iasi (Romania)
Erricos Kontoghiorghes - Cyprus University of Technology and Birkbeck University of London, UK (Cyprus)
Abstract: Computational strategies for finding the best-subset regression models are proposed. The high-dimensional (HD) case, where the number of variables exceeds the number of observations, is considered. Within this context, a theoretical combinatorial solution is proposed, based on a regression tree structure that generates all possible subset models. An efficient branch-and-bound algorithm that finds the best submodels without generating the entire tree is adapted to the HD case. Furthermore, the R package lmSubsets is employed in the HD case to identify the best submodel according to the AIC family of selection criteria. Preliminary experimental results are presented and analyzed. The efficient extension of the lmSelect algorithm to the HD case is discussed.
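As a rough illustration of the combinatorial search described above, the following Python sketch walks a subset tree depth-first and scores each submodel with AIC. It is not the authors' algorithm and does not use lmSubsets: it enumerates every subset up to a size cap with no branch-and-bound pruning. The function names (`rss`, `aic`, `best_subset`), the size cap `max_size` (kept below the number of observations so that each fit is well defined when variables outnumber observations), the intercept-free model, and the simulated data are all assumptions made for the example.

```python
import math
import random


def rss(X, y, cols):
    """Residual sum of squares of the least-squares fit of y on columns `cols` of X.

    Solves the normal equations by Gaussian elimination with partial pivoting.
    Assumes the selected columns are linearly independent (no rank check).
    """
    n, k = len(y), len(cols)
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in cols] for i in cols]
    c = [sum(X[r][i] * y[r] for r in range(n)) for i in cols]
    for p in range(k):
        piv = max(range(p, k), key=lambda r: abs(A[r][p]))
        A[p], A[piv] = A[piv], A[p]
        c[p], c[piv] = c[piv], c[p]
        for r in range(p + 1, k):
            f = A[r][p] / A[p][p]
            for q in range(p, k):
                A[r][q] -= f * A[p][q]
            c[r] -= f * c[p]
    # Back-substitution for the coefficient vector b.
    b = [0.0] * k
    for p in reversed(range(k)):
        b[p] = (c[p] - sum(A[p][q] * b[q] for q in range(p + 1, k))) / A[p][p]
    return sum(
        (y[r] - sum(X[r][j] * b[i] for i, j in enumerate(cols))) ** 2
        for r in range(n)
    )


def aic(rss_value, n, k):
    """Gaussian AIC up to an additive constant: n*log(RSS/n) + 2k."""
    return n * math.log(max(rss_value, 1e-300) / n) + 2 * k


def best_subset(X, y, max_size):
    """Depth-first walk over all non-empty subsets of size <= max_size.

    Each tree node extends its parent's subset by one variable with a larger
    index, so every subset is visited exactly once. No pruning is applied.
    """
    n, p = len(y), len(X[0])
    best = (math.inf, ())

    def visit(subset, start):
        nonlocal best
        if subset:
            score = aic(rss(X, y, subset), n, len(subset))
            if score < best[0]:
                best = (score, tuple(subset))
        if len(subset) == max_size:
            return
        for j in range(start, p):
            visit(subset + [j], j + 1)

    visit([], 0)
    return best


# Simulated HD data (hypothetical): 8 observations, 10 variables,
# with the response driven by variables 2 and 5 plus small noise.
random.seed(0)
n_obs, n_var = 8, 10
X = [[random.gauss(0, 1) for _ in range(n_var)] for _ in range(n_obs)]
y = [2 * row[2] - 3 * row[5] + 0.01 * random.gauss(0, 1) for row in X]

score, subset = best_subset(X, y, max_size=3)
print("selected variables:", subset, "AIC:", round(score, 2))
```

The exhaustive walk grows combinatorially with the number of variables and the size cap, which is the cost that a branch-and-bound strategy is meant to avoid.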