CMStatistics 2015
B1246
Topic: Contributed on Advances in mixture modelling
Title: Robust estimation for mixtures of skew data
Authors: Francesca Greselin - University of Milano Bicocca (Italy) [presenting]
Luis Angel Garcia-Escudero - Universidad de Valladolid (Spain)
Agustin Mayo-Iscar - Universidad de Valladolid (Spain)
Geoffrey McLachlan - University of Queensland (Australia)
Abstract: Observed departures from the classical Gaussian mixture model in real datasets have recently led to more flexible tools for modeling heterogeneous skew data. Among the latest proposals in the literature, we consider mixtures of skew normal distributions, which incorporate asymmetry in the components, and mixtures of $t$ distributions, which down-weight the contribution of extreme observations. Mixtures of skew $t$ distributions have widened the application of model-based clustering and classification to a great many real datasets, as they can adapt to both asymmetry and leptokurtosis in the grouped data. Unfortunately, when contamination occurs far from the bulk of the data, or even between the groups, classical inference for these models is not reliable. We propose a robust estimator for mixtures of skew normal distributions that resists sparse outliers and even pointwise contamination that could arise in data collection. The estimator is obtained constructively by incorporating impartial trimming and constraints into the EM algorithm. At each E-step, a low percentage of the observations that are least plausible under the estimated model is tentatively trimmed; at the M-step, constraints on the scatter matrices are employed to avoid singularities and reduce spurious maximizers. Applications to artificial and real data show the effectiveness of our proposal and the joint role of trimming and constraints in achieving robustness.
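The trimming and constraint steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: Gaussian components stand in for the skew normal ones to keep the example short, the constraint is assumed to be a bound `c` on the ratio of the largest to the smallest scatter eigenvalue, and it is enforced by simple clipping rather than by an optimally chosen truncation. The names `trim_mask`, `constrain_scatter`, `trimmed_em`, `alpha`, and `c` are hypothetical, and the data are assumed to have at least two dimensions.

```python
import numpy as np

def trim_mask(log_density, alpha):
    """Flag the observations to keep: all but the floor(n*alpha) least plausible."""
    n = log_density.shape[0]
    n_trim = int(np.floor(n * alpha))
    keep = np.ones(n, dtype=bool)
    if n_trim > 0:
        keep[np.argsort(log_density)[:n_trim]] = False  # lowest densities are trimmed
    return keep

def constrain_scatter(covs, c):
    """Simplified constraint (assumption): raise small eigenvalues so that
    max eigenvalue / min eigenvalue <= c across all components."""
    out = []
    decomps = [np.linalg.eigh(S) for S in covs]
    upper = max(vals.max() for vals, _ in decomps)
    lower = upper / c
    for vals, vecs in decomps:
        clipped = np.clip(vals, lower, upper)
        out.append(vecs @ np.diag(clipped) @ vecs.T)
    return np.array(out)

def gaussian_logpdf(X, mean, cov):
    """Log-density of a multivariate normal, evaluated row-wise."""
    p = X.shape[1]
    L = np.linalg.cholesky(cov)
    sol = np.linalg.solve(L, (X - mean).T)
    maha = np.sum(sol ** 2, axis=0)
    logdet = 2.0 * np.sum(np.log(np.diag(L)))
    return -0.5 * (p * np.log(2.0 * np.pi) + logdet + maha)

def trimmed_em(X, G, alpha=0.05, c=10.0, n_iter=50, seed=0):
    """Trimmed, constrained EM with Gaussian components (stand-in for skew normal)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    means = X[rng.choice(n, G, replace=False)]
    covs = np.array([np.cov(X.T) + 1e-6 * np.eye(p)] * G)
    weights = np.full(G, 1.0 / G)
    for _ in range(n_iter):
        # E-step: posterior probabilities, then tentative trimming
        log_comp = np.stack(
            [np.log(weights[g]) + gaussian_logpdf(X, means[g], covs[g]) for g in range(G)],
            axis=1,
        )
        log_mix = np.logaddexp.reduce(log_comp, axis=1)
        keep = trim_mask(log_mix, alpha)
        resp = np.exp(log_comp - log_mix[:, None]) * keep[:, None]
        # M-step: weighted updates on untrimmed points, then scatter constraint
        Nk = resp.sum(axis=0) + 1e-12  # floor guards against empty components
        weights = Nk / Nk.sum()
        means = (resp.T @ X) / Nk[:, None]
        covs = np.array(
            [((resp[:, g, None] * (X - means[g])).T @ (X - means[g])) / Nk[g] for g in range(G)]
        )
        covs = constrain_scatter(covs, c)
    return weights, means, covs, keep
```

Trimming is tentative in the sense that `keep` is recomputed at every iteration, so an observation discarded early can re-enter once the fitted model changes. A full treatment of the skew normal case would replace `gaussian_logpdf` and the M-step updates with their skew normal counterparts.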