EcoSta 2023
A1139
Title: Improving multiple linear regression with random forest using Mahalanobis distance
Authors: Jaeseong Park - Korea University (Korea, South) [presenting]
Abstract: Multiple linear regression is a widely used statistical method for modelling the relationship between a dependent variable and multiple independent variables. Random forest, a popular ensemble learning method, has been shown to be effective in solving complex regression problems. A novel approach to multiple linear regression is proposed that uses random forest with Mahalanobis distance. Mahalanobis distance is a measure of the distance between a point and a distribution, which takes into account the covariance of the data. By incorporating Mahalanobis distance into the random forest algorithm, the correlations between the independent variables can be accounted for and the influence of outliers reduced. The details of the proposed method are presented, and its performance is compared with that of traditional multiple linear regression and random forest regression.
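To make the distance measure concrete: for a point x and a distribution with mean mu and covariance Sigma, the Mahalanobis distance is sqrt((x - mu)^T Sigma^{-1} (x - mu)). The following is a minimal illustrative sketch in Python with NumPy, not the proposed method itself, since the abstract does not specify how the distance enters the random forest. The function name `mahalanobis_distances` and the synthetic data are assumptions made for illustration only.

```python
import numpy as np


def mahalanobis_distances(X):
    """Mahalanobis distance of each row of X from the sample mean of X.

    Hypothetical helper for illustration; uses the sample covariance of X.
    """
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)      # sample covariance (ddof=1)
    cov_inv = np.linalg.pinv(cov)      # pseudo-inverse guards against singular cov
    centered = X - mean
    # Quadratic form (x - mean)^T cov_inv (x - mean), evaluated row by row.
    d2 = np.einsum("ij,jk,ik->i", centered, cov_inv, centered)
    return np.sqrt(np.maximum(d2, 0.0))


# Synthetic example (assumed data): two strongly correlated predictors.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=200)
X = np.column_stack([x1, x2])

# Row 0 is replaced by a point that contradicts the correlation structure.
X[0] = [2.0, -2.0]

d = mahalanobis_distances(X)
outlier_index = int(np.argmax(d))
print("largest Mahalanobis distance at row", outlier_index)
```

In this sketch the point (2, -2) is not extreme in either coordinate taken alone, yet it receives the largest distance because it runs against the correlation between the two predictors. That is the sense in which the measure takes the covariance of the data into account.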