CMStatistics 2016
B1364
Title: Paired indices for clustering evaluation: A typology
Authors: Margarida G M S Cardoso - Instituto Universitario de Lisboa-Business Research Unit-Lisboa (Portugal) [presenting]
Abstract: Paired indices of agreement are commonly used to measure the agreement between two partitions of the same data set. They are generally determined from a cross-classification table that counts the pairs of observations which both partitions agree to join and/or separate in the clusters. However, there are still open issues regarding the specific thresholds one should consider for each index in order to draw conclusions about the degree of agreement between the partitions. We analyze the distribution of 14 indices under the null hypothesis (H0) of agreement occurring by chance to gain new insights into the indices' behavior. We resort to the IADJUST method to generate cross-classification tables under H0. The experimental scenario considers 3 clusters, balanced or unbalanced, and poorly, moderately or well separated. The analysis suggests a new typology of paired indices of agreement. This typology is based on the indices' adjusted values (values net of agreement by chance) and also on the indices' distributional characteristics within scenarios (average, quantiles, range, standard deviation, coefficient of variation, skewness and kurtosis).
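As an illustration of how paired indices are obtained from counts of pairs, the following minimal sketch computes the four pair counts for two partitions and derives three widely known indices from them (Rand, Jaccard and the Hubert-Arabie adjusted Rand). It is not the authors' code: the function names are hypothetical, the abstract does not say which 14 indices were studied, and the IADJUST method is not reproduced here.

```python
from itertools import combinations


def pair_counts(labels_a, labels_b):
    """Count pairs of observations by how the two partitions treat them.

    Returns (a, b, c, d):
      a = pairs joined in both partitions
      b = pairs joined in A only
      c = pairs joined in B only
      d = pairs separated in both partitions
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same observations")
    a = b = c = d = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a and same_b:
            a += 1
        elif same_a:
            b += 1
        elif same_b:
            c += 1
        else:
            d += 1
    return a, b, c, d


def rand_index(a, b, c, d):
    # Share of pairs on which the two partitions agree (join or separate).
    return (a + d) / (a + b + c + d)


def jaccard_index(a, b, c, d):
    # Ignores pairs separated in both partitions; undefined if a + b + c == 0.
    return a / (a + b + c)


def adjusted_rand_index(a, b, c, d):
    # Chance-corrected form: (observed - expected) / (maximum - expected).
    # Undefined (division by zero) when maximum equals expected.
    n_pairs = a + b + c + d
    expected = (a + b) * (a + c) / n_pairs
    maximum = ((a + b) + (a + c)) / 2
    return (a - expected) / (maximum - expected)


if __name__ == "__main__":
    part_a = [0, 0, 0, 1, 1, 1]
    part_b = [0, 0, 1, 1, 2, 2]
    counts = pair_counts(part_a, part_b)
    print(counts)
    print(rand_index(*counts), jaccard_index(*counts), adjusted_rand_index(*counts))
```

The adjusted Rand index shows the general pattern of a chance-adjusted value: the agreement expected by chance is subtracted from the observed agreement and the result is rescaled so that identical partitions score 1.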