View Submission - EcoSta2023
A0972
Title: Integrative group factor model for variable clustering on temporally dependent data: Optimality and algorithm
Authors: Wen Zhou - Colorado State University (United States)
Lyuou Zhang - Shanghai University of Finance and Economics (China) [presenting]
Haonan Wang - Colorado State University (United States)
Abstract: To cluster a large number of variables, a model-based clustering approach is adopted in which the population-level clusters are clearly defined and statistically interpretable. The integrative group factor model (iGFM) is proposed, which can handle temporally dependent data and allows for connections across variable clusters. This model introduces two types of latent factors, common and unique factors, to model cross-cluster connections and within-cluster similarities among variables, respectively. The difficulty of clustering variables under the iGFM is quantified in terms of a permutation-invariant clustering risk, and the minimax signal threshold is derived, below which no algorithm can cluster variables successfully. This threshold is driven by the competition between common and unique factors in the model and does not require a clear separation of clusters to guarantee perfect recovery. Using spectral decomposition and linear search techniques, a fast and minimax-optimal algorithm is developed to cluster variables. An interesting phase transition in clustering performance is also identified: the model parameter space is partitioned into three regions, corresponding to cases where perfect clustering is impossible, possible with guarantees on optimality, and possible with no provable guarantees.