CFE-CMStatistics 2024
A0336
Title: Interaction screening via Kendall's rank correlation for imbalanced multi-class classification
Authors: Shuntaro Tanaka - The Japan Research Institute, Limited (Japan) [presenting]
Hidetoshi Matsui - Shiga University (Japan)
Abstract: Screening is a useful method for selecting important variables in high-dimensional data, where the number of predictors is much larger than the sample size. Screening eliminates unnecessary variables at a low computational cost by ranking them on importance scores, such as the correlation between the response and each predictor variable. The problem considered is the selection of interactions in classification problems where sample sizes are imbalanced between classes. Specifically, a method called class-to-class KIF (CCKIF) is proposed to select interactions in imbalanced multi-class classification problems. CCKIF takes the differences in the class-wise Kendall's rank correlations to calculate the importance scores of the interactions, improving selection accuracy over the existing method, even for imbalanced data. The theoretical properties of the proposed method are provided. Simulation studies and real data analysis show that the proposed CCKIF appropriately selects important interactions, especially for data in minority classes.
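The abstract does not state the CCKIF formula, so the following is only a minimal sketch of the general idea of scoring an interaction by how much a Kendall's rank correlation differs between classes. It assumes, purely for illustration, that the score of a predictor pair (j, k) is the largest absolute difference in within-class Kendall's tau between any two classes. The function names `kendall_tau`, `class_difference_scores`, and `top_pairs`, the scoring rule itself, and the toy data are hypothetical and are not taken from the authors' work.

```python
from itertools import combinations
import random


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant pairs) / total pairs."""
    n = len(x)
    if n < 2:
        raise ValueError("need at least two observations")
    s = 0
    for i, j in combinations(range(n), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        s += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant, 0 tie
    return s / (n * (n - 1) / 2)


def class_difference_scores(X, labels):
    """Score each predictor pair by the largest absolute difference in
    within-class Kendall's tau across pairs of classes.

    ASSUMPTION: illustrative scoring rule, not the authors' CCKIF definition.
    """
    classes = sorted(set(labels))
    if len(classes) < 2:
        raise ValueError("need at least two classes")
    p = len(X[0])
    scores = {}
    for j, k in combinations(range(p), 2):
        taus = []
        for c in classes:
            rows = [r for r, lab in zip(X, labels) if lab == c]
            taus.append(kendall_tau([r[j] for r in rows],
                                    [r[k] for r in rows]))
        scores[(j, k)] = max(abs(a - b) for a, b in combinations(taus, 2))
    return scores


def top_pairs(scores, d):
    """Keep the d predictor pairs with the largest scores."""
    return sorted(scores, key=scores.get, reverse=True)[:d]


# Toy imbalanced data (hypothetical): 40 rows in class 0, 8 rows in class 1.
# Predictors 0 and 1 move together in class 0 and in opposite directions in
# class 1; predictor 2 is independent noise.
rng = random.Random(0)
X, labels = [], []
for label, size, sign in [(0, 40, 1.0), (1, 8, -1.0)]:
    for _ in range(size):
        x0 = rng.gauss(0.0, 1.0)
        X.append([x0, sign * x0, rng.gauss(0.0, 1.0)])
        labels.append(label)

scores = class_difference_scores(X, labels)
print(top_pairs(scores, 1))
```

Each class's correlation is computed separately in this sketch, so the small class contributes to the score on the same footing as the large one. This is one plausible reading of why a class-wise difference might help with minority classes; the authors' actual construction and its theoretical guarantees may differ.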