CMStatistics 2023
B0271
Title: Classification of imbalanced class labels with and without feature selection model
Authors: Senthil Murugan Nagarajan - University of Luxembourg (Luxembourg) [presenting]
Abstract: Growing data volume and dimensionality make it increasingly difficult to develop accurate classification models. Class imbalance is one of the most common problems in this setting, and it persists as datasets grow. Standard learning algorithms can lose substantial accuracy on imbalanced data because minority-class samples are more likely to be misclassified than majority-class samples. Various studies have shown that feature selection can improve classifier performance by reducing the number of features, but this has not been shown to hold for imbalanced classification. Classification of imbalanced datasets is therefore discussed both with and without feature selection. The feature selection techniques considered are PCA, Boruta, BorutaShap, and Lasso regression; the classifiers are random forest (RF), hybrid ensemble learning, k-nearest neighbour, LightGBM, and logistic regression. Beforehand, a statistical analysis of the datasets is carried out to better understand the dependent and independent variables. Three imbalanced datasets are used for the analysis. Model performance is reported in terms of F1-score, precision, recall, and accuracy.
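The four metrics named in the abstract can behave very differently on imbalanced data, which is why accuracy alone is rarely sufficient. The sketch below is illustrative only and is not taken from the submission: the label vectors, the 95/5 class split, and the helper `binary_metrics` are all hypothetical. It compares a trivial majority-class predictor with a classifier that recovers most minority samples at the cost of some false alarms.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    if not pairs:
        raise ValueError("y_true and y_pred must not be empty")
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs)
    # Precision and recall are defined as 0.0 when their denominator is zero.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return Metrics(accuracy, precision, recall, f1)


# Hypothetical labels: 95 majority-class (0) and 5 minority-class (1) samples.
y_true = [0] * 95 + [1] * 5

# A trivial predictor that always outputs the majority class.
always_majority = [0] * 100

# A hypothetical classifier: 10 false alarms among the majority samples,
# and 4 of the 5 minority samples correctly detected.
y_pred = [1] * 10 + [0] * 85 + [1] * 4 + [0] * 1

baseline = binary_metrics(y_true, always_majority)
candidate = binary_metrics(y_true, y_pred)

for name, m in [("always majority", baseline), ("candidate", candidate)]:
    print(f"{name}: accuracy={m.accuracy:.2f} precision={m.precision:.2f} "
          f"recall={m.recall:.2f} f1={m.f1:.2f}")
```

In this made-up example the majority-class predictor has the higher accuracy yet never detects a minority sample, so its recall and F1 are zero. The candidate trades some accuracy for a much higher minority-class recall.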