CMStatistics 2015
B0845
Topic: Contributed on Robust statistics in R
Title: Outlier detection in complex survey data including semi-continuous components and missing values
Authors: Matthias Templ - Vienna University of Technology (Austria) [presenting]
Peter Filzmoser - Vienna University of Technology (Austria)
Olivier Dupriez - World Bank (United States)
Abstract: Poverty and inequality are measured from household consumption, expenditure, or income data. Outliers can greatly inflate the variance of indicators, and measurement errors can bias the estimates. In particular, indicators such as the Gini coefficient or the Quintile Share Ratio (QSR) are highly sensitive to outliers when they are estimated non-robustly. The adaptation and implementation of a collection of techniques for detecting outliers and correcting them by imputation is considered. Important issues concerning the data and the outlier detection methods are the number of missing values in each data set and the structural zeros in income or consumption components. A main focus lies on understanding more than ten different outlier detection and imputation methods and their influence on the estimated Gini coefficient, and on recommending which of the outlier detection methods should be preferred for household expenditure data. In addition to the findings of a simulation study, outlier detection and imputation are carried out on six real-world consumption data sets, and the results are presented.
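To illustrate the sensitivity of the Gini coefficient and the QSR to outliers that the abstract mentions, the following is a minimal sketch in Python. It is not the authors' method: it ignores survey weights, missing values, and structural zeros, and it uses the standard unweighted textbook definitions. The function names `gini` and `qsr` and the toy expenditure values are hypothetical and invented purely for illustration.

```python
def gini(values):
    """Unweighted Gini coefficient of non-negative values (textbook formula)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one value and a positive total")
    # G = 2 * sum(i * x_(i)) / (n * sum(x)) - (n + 1) / n, ranks i = 1..n
    ranked_sum = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * ranked_sum / (n * total) - (n + 1.0) / n


def qsr(values):
    """Simplified unweighted quintile share ratio: top 20% total / bottom 20% total."""
    xs = sorted(values)
    k = len(xs) // 5  # number of observations in each tail (assumption: no weights)
    if k == 0:
        raise ValueError("need at least five values")
    bottom = sum(xs[:k])
    top = sum(xs[-k:])
    if bottom == 0:
        raise ValueError("bottom quintile total is zero")
    return top / bottom


# Toy data (invented): ten household expenditures, then the same sample
# with the largest value replaced by one gross outlier.
clean = [10, 12, 14, 15, 16, 18, 20, 22, 25, 30]
contaminated = clean[:-1] + [3000]

if __name__ == "__main__":
    for label, sample in (("clean", clean), ("contaminated", contaminated)):
        print(label, "Gini:", round(gini(sample), 3), "QSR:", round(qsr(sample), 1))
```

In this toy sample a single extreme value is enough to move both indicators substantially, which is the kind of effect that motivates detecting outliers and imputing them before estimation.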