Title: Trees, forests, and networks
Authors: Gerard Biau - Universite Pierre et Marie Curie (France) [presenting]
Abstract: Decision tree learning is a popular data-modeling technique that has been used for over fifty years in statistics, artificial intelligence, and machine learning. The history of trees continues today with random forests, which are among the most successful machine learning algorithms currently available for handling large-scale and high-dimensional data sets. It is sometimes suggested that forests have the flavor of deep network architectures, insofar as ensembles of trees can discriminate between a very large number of regions. However, the connection between random forests and neural networks remains largely unexamined. Recent theoretical and methodological developments for random forests will be reviewed, with a special emphasis on the mathematical forces driving the algorithm. Next, the random forest method will be reformulated in a neural network setting, and two new hybrid procedures will be proposed. In a nutshell, given an ensemble of random trees, it is possible to restructure them as a collection of (random) multilayered neural networks, which have sparse connections and fewer restrictions on the geometry of the decision boundaries. Their activation functions are soft, nonlinear, and differentiable, so the networks are trainable with a gradient-based optimization algorithm and are expected to exhibit better generalization performance.
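
The restructuring of a tree as a sparse multilayered network, mentioned at the end of the abstract, might be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: the toy tree, its thresholds and leaf values, the names `SPLIT_FEATURE`, `SPLIT_THRESHOLD`, `LEAF_VALUE`, `PATH`, and the sharpness parameters `gamma1` and `gamma2` are all hypothetical. The first hidden layer has one unit per split (each connected to a single input), the second has one unit per leaf, and replacing the hard threshold with `tanh` gives a soft, differentiable activation.

```python
import numpy as np

# Hypothetical toy regression tree on two features (all numbers are made up):
#   node 0: x[0] < 0.5 ? -> leaf A (value 1.0) : go to node 1
#   node 1: x[1] < 0.3 ? -> leaf B (value 2.0) : leaf C (value 3.0)
SPLIT_FEATURE = np.array([0, 1])         # feature tested at each internal node
SPLIT_THRESHOLD = np.array([0.5, 0.3])   # threshold at each internal node
LEAF_VALUE = np.array([1.0, 2.0, 3.0])   # prediction stored in leaves A, B, C
# PATH[k, j] = -1 if leaf k lies in the left subtree of node j,
#              +1 if it lies in the right subtree, 0 if node j is not on its path.
PATH = np.array([[-1.0, 0.0],
                 [1.0, -1.0],
                 [1.0, 1.0]])


def tree_predict(x):
    """Reference prediction of the toy tree, written as plain if/else."""
    if x[0] < 0.5:
        return 1.0
    return 2.0 if x[1] < 0.3 else 3.0


def activation(u, gamma):
    """Hard threshold in {-1, +1} when gamma is None, else soft tanh(gamma * u)."""
    if gamma is None:
        return np.where(u >= 0, 1.0, -1.0)
    return np.tanh(gamma * u)


def network_predict(x, gamma1=None, gamma2=None):
    """Evaluate the same tree as a network with two sparse hidden layers."""
    x = np.asarray(x, dtype=float)
    # Layer 1: one unit per split; +1 means "go right", -1 means "go left".
    h1 = activation(x[SPLIT_FEATURE] - SPLIT_THRESHOLD, gamma1)
    # Layer 2: one unit per leaf; positive only if every decision on the
    # leaf's path agrees with the layer-1 outputs.
    depth = np.abs(PATH).sum(axis=1)
    h2 = activation(PATH @ h1 - depth + 0.5, gamma2)
    # Output: map h2 from {-1, +1} to a {0, 1} leaf indicator and average.
    return float(LEAF_VALUE @ (h2 + 1.0) / 2.0)
```

With `gamma1=gamma2=None` the network reproduces the tree exactly; with finite values every operation is differentiable, so under this sketch the thresholds and leaf values could in principle be tuned by gradient descent.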