Title: Loan default analysis in Europe: Tracking regional variations using big data
Authors: Luca Barbaglia - European Commission Joint Research Centre (Italy) [presenting]
Sebastiano Manzan - European Commission (Italy)
Elisa Tosetti - Brunel University London (United Kingdom)
Abstract: Loan default behaviour in the European market is empirically investigated using a novel big data set of over 20 million residential mortgages observed from 2013 to 2018. We model the occurrence of a default as a function of loan-level information at origination, characteristics of the financial institution originating the loan, the borrower's economic situation, and local economic conditions. We adopt three alternative machine learning techniques for predicting default events, namely Logistic Regression, Gradient Boosting and XGBoost, and carry out the analysis at the NUTS2 regional level. We find that the most important variables in explaining default are the loan originator, the interest rate currently applied to the mortgage, and local economic characteristics, while other loan- or borrower-specific features are less relevant. We exploit techniques from the recent literature on interpretable machine learning to identify the most relevant factors affecting default and to capture the non-linear effects on default of some variables, such as the interest rate or changes in the unemployment rate. Our results point to consistent geographical heterogeneity in the magnitude of variable importance, indicating the need for regionally tailored European policy.
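The idea of ranking default drivers by variable importance can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic, the model is a plain logistic regression fitted by gradient descent (standing in for the three learners named in the abstract), and permutation importance is used as one common interpretable machine learning technique, since the abstract does not say which techniques were applied. The feature name `noise`, the coefficients and the sample size are all assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(w, b, row):
    # Predicted default probability for one loan.
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, row)))


def fit_logistic(X, y, lr=0.5, epochs=200):
    # Batch gradient descent on the average log-loss.
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * p, 0.0
        for row, target in zip(X, y):
            err = predict(w, b, row) - target
            grad_b += err
            for j in range(p):
                grad_w[j] += err * row[j]
        b -= lr * grad_b / n
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
    return w, b


def log_loss(w, b, X, y):
    eps = 1e-12
    total = 0.0
    for row, target in zip(X, y):
        prob = min(max(predict(w, b, row), eps), 1.0 - eps)
        total -= target * math.log(prob) + (1 - target) * math.log(1.0 - prob)
    return total / len(X)


def permutation_importance(w, b, X, y, j, rng):
    # Increase in log-loss after shuffling column j, which breaks its
    # link with the outcome while keeping its marginal distribution.
    baseline = log_loss(w, b, X, y)
    column = [row[j] for row in X]
    rng.shuffle(column)
    X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
    return log_loss(w, b, X_perm, y) - baseline


# Synthetic loans (assumption): default depends on a standardised
# interest rate only; "noise" is a placebo feature with no effect.
rng = random.Random(0)
features = ["interest_rate", "noise"]
X, y = [], []
for _ in range(1000):
    rate, noise = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    p_default = sigmoid(-2.0 + 1.5 * rate)
    X.append([rate, noise])
    y.append(1 if rng.random() < p_default else 0)

w, b = fit_logistic(X, y)
importance = {
    name: permutation_importance(w, b, X, y, j, rng)
    for j, name in enumerate(features)
}
for name, value in importance.items():
    print(f"{name}: {value:.4f}")
```

In this toy setting the shuffled interest rate should degrade the fit far more than the placebo feature. Repeating the same calculation separately for each region would be one way to compare importance magnitudes across regions, although the abstract does not state that this is how the regional comparison was carried out.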