CMStatistics 2023
B1431
Title: Integrative nearest neighbor classifiers for block-missing multi-modal data
Authors: Guan Yu - University of Pittsburgh (United States) [presenting]
Abstract: Classifiers leveraging multi-modal data often have excellent classification performance. However, in certain studies, due to various reasons, some modalities are not collected from a sizable subset of participants, and thus all data from those modalities are missing completely. Considering classification problems with a block-missing multi-modal training data set, a new integrative nearest neighbour (INN) classifier is developed. INN harnesses all available information in the training data set and the feature vector of the test data point effectively to predict the class label of the test data point without deleting or imputing any missing data. Given a test data point, INN determines the weights on the training samples adaptively by minimizing the worst-case upper bound on the estimation error of the regression function over a convex class of functions. As a weighted nearest neighbour classifier, INN suffers from the curse of dimensionality. Therefore, in high-dimensional scenarios, a two-step INN is proposed, assuming that the regression function depends on features via sparse linear combinations of features. The two-step INN estimates those linear combinations first and then uses them as new features to build the classifier. The effectiveness of the proposed methods has been demonstrated by both theoretical and numerical studies.
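To make the setting concrete, the following is a minimal sketch of a weighted nearest neighbour vote on block-missing data that neither deletes nor imputes anything. It is not the INN method: the abstract chooses the weights by minimizing a worst-case error bound, whereas this sketch substitutes a plain Gaussian kernel on distances computed over the features observed in both points. The function name `weighted_nn_predict`, the parameters `k` and `bandwidth`, the NaN encoding of missing blocks, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def weighted_nn_predict(X_train, y_train, x_test, k=5, bandwidth=1.0):
    """Weighted k-NN vote using only features observed in both points.

    Missing entries are encoded as NaN. The Gaussian-kernel weights are a
    simple heuristic stand-in, not the INN weight optimisation.
    """
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train)
    x_test = np.asarray(x_test, dtype=float)

    # shared[i, j] is True when feature j is observed in sample i and in x_test
    shared = ~np.isnan(X_train) & ~np.isnan(x_test)
    n_shared = shared.sum(axis=1)

    # Squared differences, set to zero wherever either value is missing
    diff2 = np.where(shared, (X_train - x_test) ** 2, 0.0)

    # Mean squared difference over shared features; infinite if none are shared
    dist = np.full(X_train.shape[0], np.inf)
    has_overlap = n_shared > 0
    dist[has_overlap] = diff2[has_overlap].sum(axis=1) / n_shared[has_overlap]

    # Keep the k closest samples; samples with no overlap get weight exp(-inf) = 0
    nearest = np.argsort(dist)[:k]
    weights = np.exp(-dist[nearest] / (2.0 * bandwidth ** 2))

    # Weighted vote over the class labels
    classes = np.unique(y_train)
    scores = np.array([weights[y_train[nearest] == c].sum() for c in classes])
    return classes[np.argmax(scores)]


# Toy data (hypothetical): two modalities with two features each.
# A row of NaNs within a modality means that modality was not collected.
nan = np.nan
X_toy = np.array([
    [0.0, 0.0, nan, nan],
    [0.1, 0.0, 0.0, 0.0],
    [nan, nan, 0.1, 0.1],
    [5.0, 5.0, nan, nan],
    [5.0, 5.1, 5.0, 5.0],
    [nan, nan, 5.0, 4.9],
])
y_toy = np.array([0, 0, 0, 1, 1, 1])

# Test points that each have only one modality observed
print(int(weighted_nn_predict(X_toy, y_toy, [0.05, 0.0, nan, nan], k=3)))
print(int(weighted_nn_predict(X_toy, y_toy, [nan, nan, 5.0, 5.0], k=3)))
```

Averaging the squared differences over the shared features keeps distances comparable between training samples that overlap with the test point on different numbers of modalities. That normalisation is a design choice of this sketch and is not taken from the abstract.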