CMStatistics 2023
B1326
Title: Band depth based initialization of k-means for functional data clustering
Authors: Aurora Torrente Orihuela - Universidad Carlos III de Madrid (Spain) [presenting]
Javier Albert Smet - Universidad Carlos III de Madrid (Spain)
Juan Romo - Universidad Carlos III de Madrid (Spain)
Abstract: The k-means algorithm is a popular choice for clustering multivariate data, but it is well known to be sensitive to its initialization. A substantial number of methods aim to find optimal initial seeds, though none of them is universally valid. One such method is the BRIk algorithm, which relies on clustering a set of centroids derived from bootstrap replicates of the data and on the use of the versatile modified band depth. This algorithm can be extended to functional data in different ways by first adding a step in which appropriate B-splines are fitted to the observations. A resampling process keeps the computation feasible and helps to handle issues such as noise or missing data. Two techniques for providing suitable initial seeds for functional data have been derived, stressing the multivariate and the functional nature of the observations, respectively. Results on simulated and real data indicate that the functional data approach to the BRIk method (FABRIk) and the functional data extension of the BRIk method (FDEBRIk) are more effective than previous proposals in terms of clustering recovery.
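
The multivariate seeding idea described in the abstract (bootstrap replicates, centroids from each replicate, clustering of the pooled centroids, and selection by modified band depth) can be illustrated with a minimal sketch. This is one plausible reading of that description and not the authors' implementation: the function names `modified_band_depth`, `lloyd_kmeans` and `brik_like_seeds`, the parameter `n_boot`, the plain Lloyd iteration, and the rank-based depth formula (bands of two curves, no ties assumed) are all assumptions made for illustration. The B-spline fitting step used for functional data is omitted.

```python
import numpy as np


def modified_band_depth(points):
    """Depth of each row, reading rows as curves over the columns.

    Assumption: bands formed by pairs of curves, computed with the
    rank-based shortcut, which is exact only without ties in a column.
    """
    n, _ = points.shape
    if n < 2:
        return np.ones(n)  # a single curve is trivially the deepest
    # Rank (1..n) of every value within its column.
    ranks = points.argsort(axis=0).argsort(axis=0) + 1
    # Number of pairs whose band covers the value at each coordinate.
    covering = (ranks - 1) * (n - ranks) + (n - 1)
    return covering.mean(axis=1) / (n * (n - 1) / 2)


def lloyd_kmeans(data, init_centers, n_iter=50):
    """Plain Lloyd iterations from the given initial centers."""
    centers = np.array(init_centers, dtype=float)

    def assign(c):
        sq = ((data[:, None, :] - c[None, :, :]) ** 2).sum(axis=2)
        return sq.argmin(axis=1)

    for _ in range(n_iter):
        labels = assign(centers)
        for j in range(len(centers)):
            members = data[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return centers, assign(centers)


def brik_like_seeds(data, k, n_boot=25, rng=None):
    """Hypothetical BRIk-style seeding for multivariate data."""
    rng = np.random.default_rng(rng)
    n = len(data)
    pooled = []
    for _ in range(n_boot):
        # Bootstrap replicate: resample the rows with replacement.
        sample = data[rng.integers(0, n, size=n)]
        init = sample[rng.choice(n, size=k, replace=False)]
        centers, _ = lloyd_kmeans(sample, init)
        pooled.append(centers)
    pooled = np.vstack(pooled)  # n_boot * k centroids

    # Cluster the pooled centroids into k groups.
    init = pooled[rng.choice(len(pooled), size=k, replace=False)]
    centers, labels = lloyd_kmeans(pooled, init)

    # Seed j is the deepest centroid of group j.
    seeds = []
    for j in range(k):
        group = pooled[labels == j]
        if len(group) == 0:
            seeds.append(centers[j])  # fallback for an empty group
        else:
            seeds.append(group[modified_band_depth(group).argmax()])
    return np.array(seeds)
```

The returned array could then be passed as the initial centers of a final k-means run on the full data, for example `lloyd_kmeans(data, brik_like_seeds(data, k=3))`. Choosing the deepest centroid of each group, instead of its mean, is what makes the seed less sensitive to outlying centroids from unlucky bootstrap runs.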