Title: Variable selection in the presence of factors: A model selection perspective
Authors: Rui Paulo - ISEG/CEMAPRE, Universidade de Lisboa (Portugal) [presenting]
Gonzalo Garcia-Donato - Department of Economics and Finance - Universidad de Castilla La Mancha - Instituto de Desarrollo Regional (Spain)
Abstract: The variable selection problem where the set of potential predictors contains both factors and numerical variables is considered. There are two broad approaches to variable selection: estimation-based and model-selection-based. In the former, the model containing all the potential predictors is estimated, and a criterion for excluding variables is devised based on the estimates of the associated parameters. In the latter, all $2^p$ models are considered, where $p$ stands for the number of potential predictors, and variable selection is based on the posterior distribution over the model space. Inducing sparsity is a major challenge in the estimation-based approach, while model-selection-based techniques are subject to the issue of multiplicity. We approach the variable selection problem in the presence of factors from the model selection perspective. Formally, this is a particular case of the standard variable selection setting in which factors are coded using dummy variables. Nevertheless, we show that several inputs, such as the assignment of prior probabilities over the model space or the parameterization adopted for the factors, may have a large (and difficult to anticipate) impact on the results. We provide a solution to these issues that extends existing proposals for the standard variable selection problem and does not depend on how the factors are coded using dummy variables. Additionally, our method exhibits very competitive frequentist behavior.
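
To illustrate why the coding of factors can matter, the following is a minimal sketch and not the method proposed in the talk. It assumes simulated data with a single three-level factor, plain least-squares fits, and hypothetical helper names (`treatment_coding`, `rss`, `count_models`). It shows two things: the size of the model space depends on whether a factor enters as a unit or through its individual dummy variables, and a sub-model that keeps only some dummies depends on the chosen reference level, whereas the model containing the whole factor does not.

```python
import numpy as np

def count_models(n_numeric, factor_levels, factors_as_units=True):
    """Size of the model space (assumption: every subset is a candidate model)."""
    if factors_as_units:
        # Each factor is either fully in or fully out.
        return 2 ** (n_numeric + len(factor_levels))
    # Each of the k - 1 dummies of a k-level factor is selected separately.
    return 2 ** (n_numeric + sum(k - 1 for k in factor_levels))

def treatment_coding(levels, n_levels, reference=0):
    """Dummy columns for every level except the reference one."""
    keep = [k for k in range(n_levels) if k != reference]
    return np.column_stack([(levels == k).astype(float) for k in keep])

def rss(X, y):
    """Residual sum of squares of the least-squares fit of y on [1, X]."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return float(resid @ resid)

# Simulated data (illustrative only): one factor with three levels.
rng = np.random.default_rng(0)
n, n_levels = 60, 3
levels = rng.integers(0, n_levels, size=n)
y = np.array([0.0, 1.0, 3.0])[levels] + rng.normal(size=n)

D_ref0 = treatment_coding(levels, n_levels, reference=0)  # dummies for levels 1, 2
D_ref2 = treatment_coding(levels, n_levels, reference=2)  # dummies for levels 0, 1

# Whole factor included: both codings span the same column space.
rss_full_ref0 = rss(D_ref0, y)
rss_full_ref2 = rss(D_ref2, y)

# Only the first dummy kept: the sub-model now depends on the coding.
rss_part_ref0 = rss(D_ref0[:, :1], y)  # level 1 versus levels 0 and 2 merged
rss_part_ref2 = rss(D_ref2[:, :1], y)  # level 0 versus levels 1 and 2 merged

print("models, factors as units:", count_models(2, [3, 4]))
print("models, dummies separate:", count_models(2, [3, 4], factors_as_units=False))
print("full-factor RSS:", rss_full_ref0, rss_full_ref2)
print("single-dummy RSS:", rss_part_ref0, rss_part_ref2)
```

In this sketch the two full-factor fits agree up to rounding error, while the two single-dummy fits correspond to different groupings of the levels and therefore differ. This is one simple way in which results can depend on the parameterization when dummies are selected individually.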