Title: A hybrid random forest to predict soccer matches in international tournaments
Authors: Hans Van Eetvelde - Ghent University (Belgium) [presenting]
Andreas Groll - Technical University Dortmund (Germany)
Christophe Ley - Ghent University (Belgium)
Gunther Schauberger - Technical University of Munich (Germany)
Abstract: A new hybrid modeling approach for the scores of international soccer matches is proposed, which combines random forests with Poisson ranking methods. While the random forest is based on the competing teams' covariate information, the ranking methods estimate ability parameters from historical match data that adequately reflect the teams' current strength. We compare the new hybrid random forest model to its separate building blocks, as well as to conventional Poisson regression models, with regard to predictive performance on all matches from the four FIFA World Cups 2002-2014. It turns out that including the team ability parameters from the ranking methods as an additional covariate in the random forest improves the predictive power substantially. Finally, the hybrid random forest is used (in advance of the tournament) to predict the FIFA World Cup 2018. To complement our analysis of the previous World Cup data, the 64 matches of the 2018 tournament serve as an independent validation data set, and we are able to confirm the compelling predictive potential of the hybrid random forest, which clearly outperforms all other methods, including the betting odds.
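
To illustrate the hybrid idea described in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes a simplified ranking model with one ability parameter per team, independent Poisson goal counts, and exponentially decaying match weights so that recent matches count more. The function names (fit_abilities, add_ability_covariate), the half-life, the step size, and the toy data are all hypothetical. The estimated ability difference is appended to a covariate row, which could then be passed to any random forest implementation.

```python
import math

# Toy historical data (hypothetical): (team, opponent, goals_team, goals_opponent, days_ago)
TEAMS = ["A", "B", "C"]
MATCHES = [
    ("A", "B", 3, 1, 30),
    ("A", "C", 2, 0, 200),
    ("B", "C", 2, 1, 400),
    ("C", "A", 1, 2, 800),
    ("B", "A", 1, 1, 1200),
    ("C", "B", 0, 1, 100),
]


def fit_abilities(matches, teams, half_life_days=1000.0, lr=0.05, steps=2000):
    """Estimate one ability per team by weighted Poisson maximum likelihood.

    Assumed model: goals of team i against team j ~ Poisson(exp(c + r_i - r_j)).
    Each match is weighted by 0.5 ** (days_ago / half_life_days).
    Plain gradient ascent is used; abilities are centered to sum to zero.
    """
    r = {t: 0.0 for t in teams}
    c = 0.0
    for _ in range(steps):
        grad_r = {t: 0.0 for t in teams}
        grad_c = 0.0
        total_w = 0.0
        for a, b, goals_a, goals_b, days_ago in matches:
            w = 0.5 ** (days_ago / half_life_days)
            lam_a = math.exp(c + r[a] - r[b])
            lam_b = math.exp(c + r[b] - r[a])
            # Derivative of the Poisson log-likelihood w.r.t. the linear
            # predictor is (observed goals - expected goals).
            res_a = w * (goals_a - lam_a)
            res_b = w * (goals_b - lam_b)
            grad_r[a] += res_a - res_b
            grad_r[b] += res_b - res_a
            grad_c += res_a + res_b
            total_w += w
        for t in teams:
            r[t] += lr * grad_r[t] / total_w
        c += lr * grad_c / total_w
        # Re-center for identifiability (only ability differences matter).
        mean_r = sum(r.values()) / len(r)
        for t in teams:
            r[t] -= mean_r
    return r, c


def add_ability_covariate(covariates, team, opponent, abilities):
    """Return a new covariate row with the ability difference appended."""
    return list(covariates) + [abilities[team] - abilities[opponent]]


abilities, intercept = fit_abilities(MATCHES, TEAMS)
# Hypothetical covariate row for team A facing team C (values are made up).
hybrid_row = add_ability_covariate([1.0, 2.0], "A", "C", abilities)
```

In this sketch the ability difference is simply one more column next to the other covariates, so the forest itself needs no modification.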