B0787
Title: A novel generalized extreme value gradient boosting decision tree for the class imbalanced problem in credit scoring
Authors: Raffaella Calabrese - University of Edinburgh (United Kingdom) [presenting]
Yizhe Dong - University of Edinburgh (United Kingdom)
Junfeng Zhang - University of Edinburgh (United Kingdom)
Abstract: The performance of credit scoring models can be compromised on imbalanced datasets, where the number of defaulted borrowers is significantly lower than that of non-defaulters. A gradient boosting decision tree model based on the generalized extreme value distribution (GEV-GBDT) is proposed to address the imbalanced learning problem. The performance of the approach is examined using four real-life loan datasets. The empirical results show that the GEV-GBDT model achieves superior classification performance compared with other commonly used imbalanced learning methods, including the synthetic minority oversampling technique (SMOTE) and the cost-sensitive framework. Furthermore, performance tests conducted on a series of purposely designed datasets with varying imbalance ratios show that GEV-GBDT performs well even on extremely imbalanced datasets.
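
Illustrative sketch (not taken from the abstract): the abstract does not state how the GEV distribution enters the boosting model. One plausible reading, assumed here, is that the symmetric logistic link is replaced by the asymmetric GEV response used in GEV regression for rare events, p = exp(-(1 + tau * eta)^(-1/tau)), and that the trees are fitted to the gradient of the resulting log-loss. The names gev_response, gev_logloss, gev_logloss_grad and the shape parameter tau are hypothetical and chosen only for illustration.

```python
import math


def gev_response(eta, tau):
    """Map a raw score eta to a probability with a GEV-type response (assumed form)."""
    if tau == 0.0:
        # Limiting case tau -> 0: Gumbel response.
        return math.exp(-math.exp(-eta))
    base = 1.0 + tau * eta
    if base <= 0.0:
        # The response is only defined where 1 + tau * eta > 0.
        raise ValueError("eta is outside the support: need 1 + tau * eta > 0")
    return math.exp(-(base ** (-1.0 / tau)))


def gev_logloss(y, eta, tau):
    """Negative Bernoulli log-likelihood for label y in {0, 1}."""
    p = gev_response(eta, tau)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def gev_logloss_grad(y, eta, tau):
    """Derivative of gev_logloss with respect to eta.

    A boosting routine with a custom objective would fit each new tree
    to the negative of this quantity.
    """
    p = gev_response(eta, tau)
    if tau == 0.0:
        dp_deta = p * math.exp(-eta)
    else:
        dp_deta = p * (1.0 + tau * eta) ** (-1.0 / tau - 1.0)
    return -(y - p) / (p * (1.0 - p)) * dp_deta
```

Under this assumed form, a raw score of zero maps to exp(-1) rather than 0.5, so the response curve is skewed. That asymmetry is the usual motivation for GEV-type links when one class is rare.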