View Submission - CFE-CMStatistics 2025
A0712
Title: Modularity-guided and dominant-set-based semi-supervised clustering with metric learning for functional data
Authors: Xiang Wang - Shanxi University (China) [presenting]
Honglang Wang - Indiana University Indianapolis (United States)
Abstract: Dominant-set clustering is a graph- and game-theoretic approach that identifies cohesive groups by maximizing within-cluster similarity, in contrast to common methods such as k-means, spectral, and hierarchical clustering. A dominant-set-based hierarchical bipartition procedure is proposed, formulated as a penalized optimization problem, with the tuning parameter selected to maximize the modularity of the resulting two clusters. The proposed method is applied to functional data clustering with a flexible choice of similarity measures between curves. It is robust both to imbalanced groups and to outliers, which overcomes a limitation of many existing clustering methods. A complete semi-supervised clustering method is further proposed, which learns the metric from the labeled portion of the data by maximizing modularity over a linear combination of candidate similarity metrics, and then performs hierarchical dominant-set-based clustering tuned by modularity maximization. The proposed algorithm can learn not only a global metric but also an individual metric for each cluster, which permits clustering with overlapping clusters. The method is a general clustering method and is particularly well suited to functional data, which naturally admit a variety of metrics for comparing curves. Simulation studies and real data applications demonstrate the advantages of the proposed methods.
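Illustration: the two building blocks named in the abstract, dominant-set extraction and modularity, can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' procedure: it uses the standard replicator-dynamics iteration commonly used to find a dominant set, and the usual weighted modularity of a partition. The penalized optimization, the tuning-parameter selection, and the metric learning described in the abstract are not reproduced. The function names, the toy similarity matrix, and the cutoff values are all hypothetical.

```python
def dominant_set(A, n_iter=1000, tol=1e-10, cutoff=1e-6):
    """Extract one dominant set from a symmetric, non-negative similarity
    matrix A (zero diagonal) using replicator dynamics.

    Assumption: this is the textbook iteration, not the penalized
    formulation proposed in the abstract.
    """
    n = len(A)
    x = [1.0 / n] * n  # start at the barycenter of the simplex
    for _ in range(n_iter):
        Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        xAx = sum(x[i] * Ax[i] for i in range(n))
        if xAx <= 0.0:  # no similarity mass left; stop
            break
        new_x = [x[i] * Ax[i] / xAx for i in range(n)]
        change = sum(abs(new_x[i] - x[i]) for i in range(n))
        x = new_x
        if change < tol:
            break
    # Members of the dominant set are the nodes with non-negligible weight.
    return [i for i in range(n) if x[i] > cutoff]


def modularity(A, labels):
    """Weighted modularity of a partition given by integer labels."""
    n = len(A)
    k = [sum(row) for row in A]  # weighted degrees
    two_m = sum(k)               # twice the total edge weight
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                q += A[i][j] - k[i] * k[j] / two_m
    return q / two_m


# Hypothetical toy data: curves 0-2 are mutually similar, curves 3-4 are
# mutually similar, and cross-group similarity is weak.
A = [
    [0.0, 0.9, 0.9, 0.1, 0.1],
    [0.9, 0.0, 0.9, 0.1, 0.1],
    [0.9, 0.9, 0.0, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.0, 0.8],
    [0.1, 0.1, 0.1, 0.8, 0.0],
]

support = dominant_set(A)
labels = [0 if i in support else 1 for i in range(len(A))]
q_split = modularity(A, labels)             # dominant set vs. the rest
q_other = modularity(A, [0, 0, 1, 1, 1])    # a competing bipartition
```

In this toy example the bipartition induced by the extracted dominant set scores a higher modularity than the competing split, which is the kind of comparison a modularity-guided selection rule would rely on.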