A0679
Title: Stochastic feature selection with annealing and its application to streaming data
Authors: Lizhe Sun - Shanxi University of Finance and Economics (China) [presenting]
Abstract: Feature selection is an important topic in high-dimensional statistics and machine learning, both for prediction and for understanding the underlying phenomena. It has many applications in computer vision, natural language processing, bioinformatics, and other areas. However, most feature selection methods in the literature were proposed for offline learning, and existing online feature selection methods have theoretical and practical limitations in true support recovery. Two novel online feature selection methods are proposed, based on stochastic gradient descent with a hard thresholding operator. The proposed methods simultaneously select the relevant features and build linear regression or classification models on the selected variables. Theoretical justification is provided for the consistency of the proposed methods. Numerical experiments on simulated and real sparse datasets show that the proposed methods compare favorably with state-of-the-art online methods from the literature.
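
To illustrate the general idea of combining stochastic gradient descent with a hard thresholding operator, the following is a minimal sketch for streaming linear regression. It is not the authors' algorithm: the function names (`hard_threshold`, `annealed_k`, `online_sgd_hard_threshold`), the annealing schedule, the squared-error loss, and all parameter values are assumptions chosen for illustration only.

```python
import numpy as np


def hard_threshold(w, k):
    """Keep the k largest-magnitude entries of w and set the rest to zero."""
    if k >= w.size:
        return w.copy()
    out = np.zeros_like(w)
    keep = np.argpartition(np.abs(w), -k)[-k:]
    out[keep] = w[keep]
    return out


def annealed_k(t, p, k_target, mu):
    """Hypothetical annealing schedule (an assumption, not from the abstract).

    Starts with all p features at t = 0 and decays towards k_target.
    """
    return k_target + int((p - k_target) * mu / (mu + t))


def online_sgd_hard_threshold(stream, p, k_target, lr=0.05, mu=20.0):
    """Mini-batch SGD on squared error, thresholding after every update."""
    w = np.zeros(p)
    for t, (X_batch, y_batch) in enumerate(stream):
        residual = X_batch @ w - y_batch
        grad = X_batch.T @ residual / len(y_batch)
        w = hard_threshold(w - lr * grad, annealed_k(t, p, k_target, mu))
    return w


def simulated_stream(rng, w_true, n_batches, batch_size, noise_sd):
    """Yield mini-batches from a sparse linear model with Gaussian noise."""
    for _ in range(n_batches):
        X = rng.standard_normal((batch_size, w_true.size))
        y = X @ w_true + noise_sd * rng.standard_normal(batch_size)
        yield X, y


# Simulated data: 50 features, of which only the first 5 are relevant.
rng = np.random.default_rng(0)
p, k_target = 50, 5
w_true = np.zeros(p)
w_true[:k_target] = [3.0, -2.0, 2.5, -3.0, 2.0]

stream = simulated_stream(rng, w_true, n_batches=2000, batch_size=20, noise_sd=0.1)
w_hat = online_sgd_hard_threshold(stream, p, k_target)

# Indices of the features that survive the final thresholding step.
selected = sorted(np.flatnonzero(w_hat).tolist())
print("selected features:", selected)
```

Each mini-batch is seen once and then discarded, so memory use does not grow with the length of the stream. Shrinking the number of retained features gradually, rather than thresholding to the target size from the first step, gives the weights time to stabilize before features are dropped; since the gradient step is dense, a feature zeroed at one step can still re-enter at a later one.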