A1114
Title: Feature selection and outcome prediction using kidney pathomic data
Authors: Jarcy Zee - University of Pennsylvania (United States) [presenting]
Qian Liu - Children's Hospital of Philadelphia (United States)
Jeremy Rubin - University of Pennsylvania (United States)
Fan Fan - Emory University (United States)
Laura Barisoni - Duke University (United States)
Andrew Janowczyk - Emory University (United States)
Abstract: Kidney biopsy remains the gold standard for diagnosis and aids prognostication in kidney diseases. Computational pathology leverages deep learning and automated image analysis to quantitatively and comprehensively extract features from histological structures within digital biopsy images. Statistical analyses of the resulting pathomic data are challenged by their high-dimensional, hierarchical, and unbalanced nature. Kidney tubule pathomic data from the Nephrotic Syndrome Study Network (NEPTUNE) and Cure Glomerulonephropathy (CureGN) studies are used to identify important morphologic features and predict clinical outcomes among patients with glomerular diseases. Two sets of analyses were conducted. First, tubule-level features were aggregated to the patient level using first- through fourth-order summary statistics, principal component analysis was used to group highly correlated features, minimum redundancy maximum relevance was used to rank feature groups, and a series of ridge regressions was used to select top features and predict survival outcomes. Second, a novel CLUstering Structured lasSO (CLUSSO) scalar-on-matrix regression technique was developed and applied, which used cluster analysis to group similar tubules based on feature values and then used structured lasso to select important features and predict kidney function. These approaches allow for the selection of interpretable and clinically relevant pathogenic features to inform prognosis.
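
Illustrative sketch (not the authors' code): the first analysis begins by collapsing a variable number of tubules per patient into a fixed-length patient-level vector. The abstract does not name the four summary statistics, so this minimal example assumes mean, standard deviation, skewness, and kurtosis computed from population central moments. The function names, the feature name "area", and the toy values are all hypothetical.

```python
from collections import defaultdict
from math import sqrt


def summarize(values):
    """Return first- through fourth-order summaries of a list of numbers.

    Assumption: mean, SD, skewness, and (non-excess) kurtosis based on
    population central moments; the abstract does not specify these.
    """
    n = len(values)
    mean = sum(values) / n
    # Central moments of order 2, 3, and 4
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    # Skewness and kurtosis are undefined when there is no variation
    skew = m3 / m2 ** 1.5 if m2 > 0 else float("nan")
    kurt = m4 / m2 ** 2 if m2 > 0 else float("nan")
    return {"mean": mean, "sd": sqrt(m2), "skew": skew, "kurt": kurt}


def aggregate_to_patient(rows):
    """Aggregate tubule-level rows to one feature dictionary per patient.

    rows: iterable of (patient_id, feature_name, value), one per tubule.
    """
    grouped = defaultdict(list)
    for patient_id, feature, value in rows:
        grouped[(patient_id, feature)].append(value)

    patients = defaultdict(dict)
    for (patient_id, feature), values in grouped.items():
        for stat, result in summarize(values).items():
            patients[patient_id][f"{feature}_{stat}"] = result
    return dict(patients)


# Toy, made-up tubule measurements: patients contribute unequal numbers of tubules
toy_rows = [
    ("p1", "area", 1.0), ("p1", "area", 2.0),
    ("p1", "area", 3.0), ("p1", "area", 6.0),
    ("p2", "area", 4.0), ("p2", "area", 4.0),
]
patient_features = aggregate_to_patient(toy_rows)
```

Each patient ends up with the same set of columns regardless of how many tubules were measured, which is what allows standard patient-level methods (such as the principal component grouping and ridge regressions described above) to be applied downstream.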