A1246
Title: Utilizing latent space representation for clustering chronic kidney disease subtypes via electronic health records
Authors: Ren-Hua Chung - National Health Research Institutes (Taiwan) [presenting]
Djeane Debora Onthoni - National Health Research Institutes (Taiwan)
Kuei-Yuan Lan - National Health Research Institutes (Taiwan)
Tsung-Hsien Huang - National Health Research Institutes (Taiwan)
Ying-Erh Chen - Tamkang University (Taiwan)
Abstract: Chronic Kidney Disease (CKD) is a globally prevalent, multifaceted disease whose root causes vary among patients, complicating analysis, treatment, and prognosis prediction. Electronic Health Records (EHRs), a valuable source of diverse, longitudinal medical data, were used to investigate CKD subtypes. However, the high dimensionality, heterogeneity, and incomplete time series of EHR data posed challenges. Given the chronological nature of EHR data, an end-to-end framework was devised to cluster CKD subtypes while accounting for the time gaps between patient visits. The framework, implemented using the UK Biobank's EHR dataset, comprises three stages: data preprocessing, transformation, and clustering. A Convolutional Autoencoder (ConvAE) architecture converts the preprocessed data into a low-dimensional representation, which is then clustered using Principal Component Analysis (PCA) and the K-means algorithm. The transformation and clustering steps are evaluated using accuracy, silhouette, purity, and entropy scores. High scores confirmed the framework's efficacy, enabling clinical patterns to be deciphered for a nuanced understanding of each CKD subtype within its cluster.
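As an illustration of two of the evaluation scores named above, the following is a minimal sketch of cluster purity and entropy under their commonly used definitions. The abstract does not state which exact definitions or reference labels were used, so the function name `purity_and_entropy`, the base-2 logarithm, and the toy data are assumptions made for illustration only.

```python
from collections import Counter, defaultdict
from math import log2


def purity_and_entropy(cluster_ids, reference_labels):
    """Return (purity, size-weighted entropy in bits) of a clustering.

    Hypothetical helper: assumes each sample has a cluster assignment and
    a reference label (for example, a known subtype) to compare against.
    """
    if len(cluster_ids) != len(reference_labels):
        raise ValueError("inputs must have the same length")
    if not cluster_ids:
        raise ValueError("inputs must be non-empty")

    # Group the reference labels by the cluster each sample was assigned to.
    members = defaultdict(list)
    for cluster, label in zip(cluster_ids, reference_labels):
        members[cluster].append(label)

    total = len(cluster_ids)
    majority_total = 0
    weighted_entropy = 0.0
    for labels in members.values():
        counts = Counter(labels)
        size = len(labels)
        # Purity counts the samples carrying the cluster's most common label.
        majority_total += max(counts.values())
        # Shannon entropy of the label distribution inside this cluster.
        cluster_entropy = -sum(
            (c / size) * log2(c / size) for c in counts.values()
        )
        # Weight each cluster's entropy by its share of all samples.
        weighted_entropy += (size / total) * cluster_entropy

    return majority_total / total, weighted_entropy


# Assumed toy data: six patients, two clusters, two reference subtypes.
clusters = [0, 0, 0, 1, 1, 1]
subtypes = ["A", "A", "B", "B", "B", "B"]
purity, entropy = purity_and_entropy(clusters, subtypes)
```

On this toy input, five of the six samples share their cluster's majority label, so purity is 5/6, and only the mixed cluster contributes to the entropy. Higher purity and lower entropy both indicate clusters that align more closely with the reference labels.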