Title: Variable role screening for high-dimensional model-based discriminant analysis
Authors: Michael Fop - University College Dublin (Ireland) [presenting]
Pierre-Alexandre Mattei - INRIA, Université Côte d'Azur (France)
Thomas Brendan Murphy - University College Dublin (Ireland)
Charles Bouveyron - INRIA, Université Côte d'Azur (France)
Abstract: Discriminant analysis is a popular supervised classification method used in a variety of fields. Applying it to high-dimensional data poses major challenges, and scalable variable selection plays an important role when the sample size is much smaller than the number of predictor variables. Sure independence screening is a simple yet effective method for this purpose: a large-scale screening step is followed by moderate-scale variable selection, with the aim of reducing the high-dimensional problem to a manageable scale. Although it allows for scalable inference, this approach relies on an independence assumption on the predictors that is often too restrictive in practical data analysis and that masks the presence of potentially redundant variables highly correlated with the truly relevant ones. We discuss a variable screening method that relaxes this independence assumption, in which different models are specified according to the roles that the variables in each pair of predictors take with respect to the target class variable: relevant, redundant, or uninformative. The approach is specified within a Bayesian framework, where the use of conjugate priors allows fast and efficient computation of the Bayes factors used to compare the different models. The method is illustrated on data examples, and its practical implementation in high-dimensional settings is discussed.
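To make the role of conjugate priors concrete, the following is a minimal sketch of a closed-form Bayes factor, not the authors' method. It is deliberately simplified: it scores a single predictor rather than a pair, compares only two roles ("relevant", with class-specific Gaussian parameters, against "uninformative", with parameters shared across classes), and omits the "redundant" role entirely. The normal-inverse-gamma prior, its hyperparameters (m0, k0, a0, b0), and all function names are assumptions chosen for illustration.

```python
import math


def log_marginal_nig(x, m0=0.0, k0=0.01, a0=1.0, b0=1.0):
    """Log marginal likelihood of x under a Gaussian model with a
    normal-inverse-gamma prior (hyperparameters are illustrative assumptions):
    mu | s2 ~ N(m0, s2 / k0),  s2 ~ InvGamma(a0, b0)."""
    n = len(x)
    if n == 0:
        return 0.0
    xbar = sum(x) / n
    ss = sum((v - xbar) ** 2 for v in x)
    # Conjugate posterior updates; no numerical integration is needed.
    kn = k0 + n
    an = a0 + n / 2.0
    bn = b0 + 0.5 * ss + k0 * n * (xbar - m0) ** 2 / (2.0 * kn)
    return (math.lgamma(an) - math.lgamma(a0)
            + a0 * math.log(b0) - an * math.log(bn)
            + 0.5 * (math.log(k0) - math.log(kn))
            - 0.5 * n * math.log(2.0 * math.pi))


def log_bf_relevant_vs_uninformative(x, y):
    """Log Bayes factor for one predictor x given class labels y:
    class-specific parameters (relevant) vs shared parameters (uninformative)."""
    groups = {}
    for value, label in zip(x, y):
        groups.setdefault(label, []).append(value)
    log_m_relevant = sum(log_marginal_nig(g) for g in groups.values())
    log_m_uninformative = log_marginal_nig(list(x))
    return log_m_relevant - log_m_uninformative


if __name__ == "__main__":
    base = [0.1 * i for i in range(-10, 10)]
    labels = [0] * len(base) + [1] * len(base)
    shifted = base + [v + 3.0 for v in base]   # class means differ
    identical = base + base                    # same values in both classes
    print(log_bf_relevant_vs_uninformative(shifted, labels))
    print(log_bf_relevant_vs_uninformative(identical, labels))
```

Because each marginal likelihood is available in closed form, scoring one predictor costs a single pass over the data, which is what makes this kind of comparison feasible as a screening step over many variables. A positive log Bayes factor favours the "relevant" model and a negative one favours the "uninformative" model.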