EcoSta 2023
A0935
Title: Selection problems in multiple instance learning
Authors: Seongoh Park - Sungshin Women's University (Korea, South) [presenting]
Johan Lim - Seoul National University (Korea, South)
Xinlei Wang - University of Texas at Arlington (United States)
Tao Wang - UT Southwestern Medical Center (United States)
Abstract: In multiple instance learning (MIL), a bag represents a sample that consists of a set of instances, each described by a vector of explanatory variables, while the entire bag carries only one label/response. Though many methods for MIL have been developed to date, few have paid attention to the interpretability of models and results. Two different models are considered to select instances, variables, or both. The first is a Bayesian hierarchical regression model that addresses the two selection problems simultaneously. To this end, the shotgun stochastic search algorithm is modified to fit the MIL context. The model is applied to the musk data to predict binding strengths between molecules (bags) and receptors (instances). The second approach is a multiple instance neural network based on sparse attention. The sparse attention structure drops uninformative instances in each bag, achieving both interpretability and better predictive performance. It is applied to a cancer detection problem where the one-to-many correspondence between a patient and multiple T cell receptor (TCR) sequences prevents researchers from simply adopting classical statistical/machine learning methods. Recent attempts to model this type of data still leave room for improvement, especially for certain cancer types. Furthermore, explainable neural network models have not been fully investigated. The proposed method aims to fill this gap.
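
Illustration (not from the authors): the abstract does not specify which form of sparse attention is used. The minimal sketch below assumes sparsemax, a simplex projection that can assign exactly zero weight to some instances, applied to tanh-based attention scores over one bag. The names `sparsemax`, `attention_pool`, `V`, and `w` are hypothetical, and the parameters are random rather than learned.

```python
import numpy as np

def sparsemax(z):
    """Euclidean projection of scores z onto the probability simplex.

    Unlike softmax, the result can contain exact zeros.
    """
    z = np.asarray(z, dtype=float)
    z_sorted = np.sort(z)[::-1]           # scores in descending order
    cumsum = np.cumsum(z_sorted)
    k = np.arange(1, z.size + 1)
    support = 1 + k * z_sorted > cumsum   # True for a prefix of the sorted scores
    k_max = k[support][-1]                # size of the support set
    tau = (cumsum[k_max - 1] - 1) / k_max # threshold subtracted from every score
    return np.maximum(z - tau, 0.0)

def attention_pool(bag, V, w):
    """Pool one bag (n_instances x n_features) into a single embedding.

    V (n_features x hidden) and w (hidden,) are assumed attention parameters.
    """
    scores = np.tanh(bag @ V) @ w         # one attention score per instance
    weights = sparsemax(scores)           # sparse, nonnegative, sums to 1
    return weights @ bag, weights

# Toy bag: 6 instances with 4 explanatory variables each (random, for illustration).
rng = np.random.default_rng(0)
bag = rng.normal(size=(6, 4))
V = rng.normal(size=(4, 3))
w = rng.normal(size=3)

embedding, weights = attention_pool(bag, V, w)
kept = np.flatnonzero(weights > 0)        # instances that receive nonzero weight
```

Instances whose weight is exactly zero do not contribute to the bag embedding, which is one way a sparse attention layer could make instance selection readable directly from the weights.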