View Submission - EcoSta2024
A1037
Title: Vertex cover matroid variable selection
Authors: Toby Kenney - Dalhousie University (Canada) [presenting]
Hong Gu - Dalhousie University (Canada)
Sarah Organ - Dalhousie University (Canada)
Abstract: Medium-to-high dimensional variable selection is plagued by the issue of correlation. When predictors are highly correlated, it is often impossible to tell which is "true". For prediction purposes, this is not a serious issue. However, when the interest is in controlling the false discovery rate, it becomes impossible to achieve good variable selection. In order to overcome this issue, a new paradigm is developed for variable selection, where instead of selecting a single set of variables, a list is provided of possible sets of true variables. This allows for making selections of the form "one of this pair of variables" when appropriate. Vertex cover matroids of graphs are found to be an effective structure for selecting variables in this paradigm. A challenge is defining the false positive and true positive rates when we are not selecting individual variables. By viewing variable selection in the right way, there is a very natural extension of the usual definitions to the current case, and while computation of these true positive and false positive rates is theoretically NP-hard, in practice, it is usually fairly easy to compute them. Through simulation studies, the new paradigm is shown to control the false discovery rate at the desired level while greatly increasing the true discovery rate compared with state-of-the-art methods. It is also shown that the selected variables have better predictive ability than the variables selected by other methods.
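As a rough illustration of selections of the form "one of this pair of variables", the sketch below treats each such pair as an edge of a graph and enumerates the minimal vertex covers, each of which is one candidate set of true variables. This is only one possible reading of the abstract, not the authors' procedure: the variable names, the edge list, and the helper functions `is_vertex_cover` and `minimal_vertex_covers` are all hypothetical, and the brute-force search is meant for small examples only.

```python
from itertools import combinations


def is_vertex_cover(subset, edges):
    """Return True if every edge has at least one endpoint in subset."""
    return all(u in subset or v in subset for u, v in edges)


def minimal_vertex_covers(vertices, edges):
    """Enumerate all inclusion-minimal vertex covers by brute force.

    Subsets are visited in order of increasing size, so any cover that
    strictly contains an already accepted cover cannot be minimal.
    The search is exponential in the number of vertices.
    """
    covers = []
    for size in range(len(vertices) + 1):
        for combo in combinations(vertices, size):
            candidate = frozenset(combo)
            if not is_vertex_cover(candidate, edges):
                continue
            if any(found < candidate for found in covers):
                continue  # contains a smaller cover, so not minimal
            covers.append(candidate)
    return covers


# Hypothetical example: x1 and x2 are highly correlated, as are x3 and x4,
# so the data only support "one of x1, x2" and "one of x3, x4".
# x5 is not involved in any pair and so appears in no minimal cover.
variables = ["x1", "x2", "x3", "x4", "x5"]
pairs = [("x1", "x2"), ("x3", "x4")]

candidate_sets = minimal_vertex_covers(variables, pairs)
for s in candidate_sets:
    print(sorted(s))
```

In this toy case the output is a list of four candidate sets, one for each way of picking a single variable from each correlated pair, rather than a single selected set.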