A0922
Title: A Gaussian mixture model with a modified Hard EM algorithm in clustering problems
Authors: Samyajoy Pal - LMU Munich (Germany) [presenting]
Christian Heumann - Ludwig-Maximilians-University Munich (Germany)
Abstract: Hard EM, also known as Viterbi training, is often used for complex unsupervised learning models because it is less computationally intensive than standard EM and easy to implement. However, it is considered inferior to standard EM because of known theoretical disadvantages, such as biased estimates and a lack of consistency. Moreover, the circumstances under which one algorithm should be preferred over the other are not well understood. We revisit Hard EM for cluster analysis and propose some modifications to the algorithm for fitting Gaussian mixture models. The performance of the resulting model is assessed in different situations (increasing number of clusters, increasing dimension, increasing overlap, imbalance, etc.) on five benchmark data sets. The results are then compared with those of standard EM to investigate whether Hard EM really performs as badly as assumed. The study also includes an analysis of two real data sets from the biological sciences to explore the practical usefulness of the proposed models.
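
For readers unfamiliar with the baseline, the following is a minimal sketch of classical (unmodified) Hard EM for a Gaussian mixture, in which each point is assigned to its single most probable component and the parameters are then re-estimated from that hard partition. It is not the modified algorithm proposed in the talk, whose details the abstract does not give. The function name `hard_em_gmm`, its parameters (`init_means`, `reg`, `n_iter`, `seed`), the covariance regularisation, and the handling of near-empty clusters are all illustrative assumptions.

```python
import numpy as np


def hard_em_gmm(X, k, init_means=None, n_iter=50, reg=1e-6, seed=0):
    """Classical Hard EM for a Gaussian mixture (illustrative sketch only)."""
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    rng = np.random.default_rng(seed)

    # Assumed initialisation: means from k data points (or user-supplied),
    # one shared covariance for all components, equal mixing weights.
    if init_means is None:
        means = X[rng.choice(n, size=k, replace=False)].copy()
    else:
        means = np.array(init_means, dtype=float)
    shared = np.cov(X, rowvar=False).reshape(d, d) + reg * np.eye(d)
    covs = np.stack([shared] * k)
    weights = np.full(k, 1.0 / k)
    labels = np.full(n, -1)

    for _ in range(n_iter):
        # Hard E-step: log(weight * Gaussian density) per component,
        # then keep only the most probable component for each point.
        log_post = np.empty((n, k))
        for j in range(k):
            diff = X - means[j]
            _, logdet = np.linalg.slogdet(covs[j])
            maha = np.einsum("ni,ij,nj->n", diff, np.linalg.inv(covs[j]), diff)
            log_post[:, j] = np.log(weights[j]) - 0.5 * (
                logdet + maha + d * np.log(2 * np.pi)
            )
        new_labels = log_post.argmax(axis=1)
        if np.array_equal(new_labels, labels):
            break  # the partition no longer changes
        labels = new_labels

        # M-step: re-estimate each component from its assigned points only.
        for j in range(k):
            members = X[labels == j]
            if len(members) <= d:
                continue  # assumption: keep old parameters for tiny clusters
            means[j] = members.mean(axis=0)
            covs[j] = np.cov(members, rowvar=False).reshape(d, d) + reg * np.eye(d)
            weights[j] = len(members) / n
        weights = weights / weights.sum()  # renormalise if a cluster was skipped

    return labels, means, covs, weights
```

Standard EM differs only in the E-step: instead of the `argmax`, it keeps the full posterior probabilities and uses them as soft weights in the M-step. The hard assignment is what makes the procedure cheaper, and it is also the source of the bias mentioned in the abstract, since points in overlapping regions are attributed entirely to one component.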