View Submission - COMPSTAT2023
A0219
Title: Mixture of linear mixed models for clustering weighted random graphs
Authors: Shu-Kay Angus Ng - Griffith University (Australia) [presenting]
Richard Tawiah - University of Melbourne (Australia)
Hien Nguyen - University of Queensland (Australia)
Florence Forbes - INRIA (France)
Abstract: Typical clustering methods assume observed data are independent. However, this assumption is often not valid with modern data (e.g., random graphs consisting of a set of nodes and a relational tie measured on each pair of nodes). Clustering methods that are not adequate to capture the complex dependence structure among highly correlated data often lead to biased estimates, misleading conclusions, lack of model fit, unstable inference, and inaccurate presentation of heterogeneity or data variability. The aim is to develop a new statistical approach using mixtures of linear mixed models (LMMs) for clustering weighted random graphs (where observed edge responses are real numbers). Random effects models have been a widely successful tool in capturing complex correlations among observations. Building on the developments in mixture models and LMMs, the proposed approach incorporates two sets of random effects in the linear predictor for the mean response to capture within-node and transitivity dependences and model node-level and paired node-level variability. Maximum likelihood estimation of the unknown parameters can be performed conditionally on the node-specific random effects within an incomplete-data framework of the expectation-maximisation (EM) algorithm. The proposed method is applied in comorbidity research using a real-world data set from the Australian National Health Survey (NHS) to identify clusters of comorbid medical conditions.
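As an illustration of why the independence assumption fails for weighted random graphs, the following is a minimal simulation sketch and not the authors' model. It assumes, purely for illustration, that each edge response is a common mean plus a random effect for each of its two nodes, a pair-level random effect, and residual noise. The function names, the additive form, and the variance values are all hypothetical. It covers only the within-node dependence (edges that share a node), not the transitivity dependence or the mixture components.

```python
import math
import random
from itertools import combinations


def simulate_edge_weights(n_nodes, mean=0.0, sd_node=1.0, sd_pair=0.5,
                          sd_noise=0.5, seed=0):
    """Simulate one weighted graph under an assumed additive form:
    y_ij = mean + a_i + a_j + b_ij + e_ij (hypothetical, for illustration).
    """
    rng = random.Random(seed)
    # One random effect per node, shared by every edge touching that node.
    node_effect = [rng.gauss(0.0, sd_node) for _ in range(n_nodes)]
    weights = {}
    for i, j in combinations(range(n_nodes), 2):
        pair_effect = rng.gauss(0.0, sd_pair)  # paired node-level effect
        noise = rng.gauss(0.0, sd_noise)       # residual error
        weights[(i, j)] = (mean + node_effect[i] + node_effect[j]
                           + pair_effect + noise)
    return weights


def correlation(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def edge_correlations(n_graphs=2000, n_nodes=4):
    """Across many simulated graphs, correlate edge (0,1) with an edge that
    shares node 0 and with an edge that shares no node."""
    e01, e02, e23 = [], [], []
    for g in range(n_graphs):
        w = simulate_edge_weights(n_nodes, seed=g)
        e01.append(w[(0, 1)])
        e02.append(w[(0, 2)])  # shares node 0 with edge (0, 1)
        e23.append(w[(2, 3)])  # shares no node with edge (0, 1)
    return correlation(e01, e02), correlation(e01, e23)


if __name__ == "__main__":
    shared, disjoint = edge_correlations()
    print(f"edges sharing a node: r = {shared:.2f}")
    print(f"edges with no common node: r = {disjoint:.2f}")
```

Under these assumed variances, edges that share a node have a theoretical correlation of 1 / 2.5 = 0.4, whereas edges with no common node are uncorrelated. A clustering method that treats all edge responses as independent ignores this structure.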