CMStatistics 2023
B1440
Title: Debiasing SHAP scores in tree ensembles
Authors: Markus Loecher - Berlin School of Economics and Law (Germany) [presenting]
Abstract: Black-box machine learning models are now used for high-stakes decision-making in areas such as healthcare and criminal justice. While tree-based ensemble methods such as random forests typically outperform deep learning models on tabular data sets, their built-in variable importance measures are known to be strongly biased towards high-entropy features. It was recently shown that the increasingly popular SHAP (SHapley Additive exPlanations) values suffer from a similar bias. Debiased, or "shrunk", SHAP scores based on sample splitting are proposed; the splitting additionally enables the detection of overfitting at the feature level.
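The bias and the role of sample splitting can be illustrated with a toy simulation. The sketch below is not the method proposed in the talk and does not compute SHAP values: as a simplifying assumption, it uses the squared-error gain of a single decision stump as a stand-in importance score. The response is pure noise, so neither feature is informative. The function names (`fit_stump`, `mse_gain`, `simulate`) and all parameter values are hypothetical and chosen for illustration only.

```python
import random


def fit_stump(x, y):
    """Fit a one-split regression stump by minimising in-sample squared error.

    Returns (threshold, left_mean, right_mean); points with x <= threshold go left.
    """
    pairs = sorted(zip(x, y))
    n = len(pairs)
    total = sum(v for _, v in pairs)
    best = None  # (score, threshold, left_mean, right_mean)
    left_sum = 0.0
    for i in range(1, n):
        left_sum += pairs[i - 1][1]
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no split is possible between equal feature values
        left_mean = left_sum / i
        right_mean = (total - left_sum) / (n - i)
        # Minimising the SSE is equivalent to maximising this score.
        score = i * left_mean ** 2 + (n - i) * right_mean ** 2
        if best is None or score > best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (score, threshold, left_mean, right_mean)
    if best is None:  # constant feature: predict the overall mean everywhere
        mean = total / n
        return (float("inf"), mean, mean)
    return best[1:]


def mse_gain(stump, baseline, x, y):
    """Reduction in mean squared error of the stump versus a constant baseline."""
    threshold, left_mean, right_mean = stump
    base = sum((v - baseline) ** 2 for v in y) / len(y)
    model = sum(
        (v - (left_mean if u <= threshold else right_mean)) ** 2
        for u, v in zip(x, y)
    ) / len(y)
    return base - model


def draw_noise(rng, n):
    """Draw a pure-noise response and two uninformative features."""
    y = [rng.randint(0, 1) for _ in range(n)]
    features = {
        "binary": [rng.randint(0, 1) for _ in range(n)],   # low entropy
        "continuous": [rng.random() for _ in range(n)],    # high entropy
    }
    return features, y


def simulate(n=100, reps=200, seed=0):
    """Average in-sample and held-out stump gains over repeated noise data sets."""
    rng = random.Random(seed)
    names = ("binary", "continuous")
    avg = {name: {"in": 0.0, "out": 0.0} for name in names}
    for _ in range(reps):
        train_x, train_y = draw_noise(rng, n)
        test_x, test_y = draw_noise(rng, n)
        baseline = sum(train_y) / n
        for name in names:
            stump = fit_stump(train_x[name], train_y)
            avg[name]["in"] += mse_gain(stump, baseline, train_x[name], train_y) / reps
            avg[name]["out"] += mse_gain(stump, baseline, test_x[name], test_y) / reps
    return avg


if __name__ == "__main__":
    for name, gains in simulate().items():
        print(name, gains)
```

In this sketch the continuous feature offers many more candidate split points than the binary one, so its in-sample gain is inflated even though both features are noise. Scoring the same fitted stump on held-out data removes that apparent gain, which is the intuition behind using sample splitting both to shrink importance scores and to flag features whose contribution is driven by overfitting.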