View Submission - EcoSta 2025
A0929
Title: Interpretable network-assisted prediction via random forests
Authors: Tiffany Tang - University of Notre Dame (United States) [presenting]
Abstract: Machine learning algorithms often assume that training samples are independent. When data points are connected by a network, however, this dependence poses both a challenge, since it reduces the effective sample size, and an opportunity to improve prediction by leveraging information from network neighbors. Multiple prediction methods taking advantage of this opportunity are now available. Many, including graph neural networks, are not easily interpretable, which limits their usefulness in the biomedical and social sciences, where understanding how a model makes its predictions is often more important than the prediction itself. Others, such as network-assisted linear regression, are interpretable but generally do not match the prediction accuracy of more flexible models. This gap is bridged by proposing a family of flexible network-assisted models built upon a generalization of random forests (RF+), which both achieve highly competitive prediction accuracy and can be interpreted through feature importance measures. In particular, a suite of novel interpretation tools is provided that enables practitioners not only to identify important features that drive model predictions but also to quantify the importance of the network contribution to prediction. These general tools broaden the scope and applicability of network-assisted machine learning for high-impact problems where interpretability and transparency are essential.
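The abstract does not specify the RF+ construction, so the following is only an illustrative sketch of the general idea it describes: fold network information into a random forest by appending neighbor-averaged features, then split the fitted model's feature importances into an "own-feature" block and a "network" block to quantify the network's contribution. The simulated data, the neighbor-averaging scheme, and all parameter choices are assumptions for illustration, not the authors' RF+ method.

```python
# Illustrative sketch only: NOT the authors' RF+ method. We augment each
# node's features with neighbor averages, fit a standard random forest,
# and read off how much importance falls on the network-derived block.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n, p = 200, 5
X = rng.normal(size=(n, p))  # node-level features (simulated)

# Toy symmetric adjacency matrix with a few neighbors per node (assumption).
A = (rng.random((n, n)) < 3 / n).astype(float)
A = np.maximum(A, A.T)
np.fill_diagonal(A, 0.0)

# Neighbor-averaged features carry the network information.
deg = np.maximum(A.sum(axis=1, keepdims=True), 1.0)  # guard isolated nodes
X_net = (A @ X) / deg

# Simulated outcome depending on both a node's own features and its
# neighbors' features, so the network genuinely helps prediction.
y = X[:, 0] + 0.5 * X_net[:, 0] + 0.1 * rng.normal(size=n)

rf = RandomForestRegressor(n_estimators=200, random_state=0)
rf.fit(np.hstack([X, X_net]), y)

# Impurity-based importances sum to 1; split them into the two blocks.
imp = rf.feature_importances_
own_share, net_share = imp[:p].sum(), imp[p:].sum()
print(f"own-feature importance share: {own_share:.2f}")
print(f"network importance share:     {net_share:.2f}")
```

In this toy setup the network block receives a nonzero share of the total importance, giving a crude analogue of "quantifying the importance of the network contribution"; the actual RF+ interpretation tools described in the abstract are more refined than raw impurity importances.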