View Submission - COMPSTAT2023
A0254
Title: Chestnut plot to visualize aggregated symbolic data
Authors: Junji Nakano - Chuo University (Japan) [presenting]
Nobuo Shimizu - The Institute of Statistical Mathematics (Japan)
Yoshikazu Yamamoto - Tokushima Bunri University (Japan)
Abstract: When we have a very large amount of data, we are sometimes more interested in comparing meaningful groups of observations than in individual observations. Aggregated symbolic data (ASD) represent a group of observations on continuous and categorical variables by the moments of those variables up to the second order. The ASD for a group is equivalent to the set of means, variances, and correlations of the continuous variables, the Burt matrix of the categorical variables, and the means of each continuous variable within each category of a categorical variable. Because ASD with many categorical variables are still complicated, it is preferable to have simple measures of location and dispersion for a categorical variable, and measures of correlation between two variables, whether categorical or continuous. We propose such measures and use them to visualize ASD with an extension of multiple correspondence analysis. We refer to the proposed graph as a chestnut plot because of the shape that represents each ASD.
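The ingredients of an ASD listed in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `aggregate_group`, its arguments, and the toy data are hypothetical, and the use of population (divide-by-n) covariances is an assumption. The sketch one-hot codes the categorical variables, accumulates first and second moments of the combined columns, and reads off the means and covariances of the continuous variables, the Burt matrix, and the within-category means. Correlations would follow by rescaling the covariances.

```python
def aggregate_group(cont_rows, cat_rows, levels):
    """Summarize one group of observations by first and second moments.

    cont_rows: list of rows of floats (continuous variables)
    cat_rows:  list of rows of labels (categorical variables)
    levels:    one list of possible labels per categorical variable
               (assumed known in advance; hypothetical interface)
    """
    n = len(cont_rows)
    p = len(cont_rows[0])
    # One indicator column per (categorical variable, level) pair.
    columns = [(j, lev) for j, levs in enumerate(levels) for lev in levs]
    z_rows = []
    for x, c in zip(cont_rows, cat_rows):
        indicators = [1.0 if c[j] == lev else 0.0 for j, lev in columns]
        z_rows.append(list(x) + indicators)
    d = p + len(columns)

    # First moments (column sums) and second moments (cross-product sums).
    first = [sum(r[k] for r in z_rows) for k in range(d)]
    second = [[sum(r[a] * r[b] for r in z_rows) for b in range(d)]
              for a in range(d)]

    means = [first[k] / n for k in range(p)]
    # Population covariance (divides by n); an assumption of this sketch.
    cov = [[second[a][b] / n - means[a] * means[b] for b in range(p)]
           for a in range(p)]
    # Indicator-by-indicator block: the Burt matrix of category counts.
    burt = [row[p:] for row in second[p:]]
    # Continuous-by-indicator block divided by category counts:
    # mean of each continuous variable within each category.
    cat_means = [[second[a][p + b] / first[p + b] if first[p + b] else None
                  for b in range(len(columns))]
                 for a in range(p)]
    return {"n": n, "means": means, "cov": cov, "burt": burt,
            "cat_means": cat_means, "columns": columns}


# Toy group: two continuous and two categorical variables (made-up values).
cont = [[1.0, 2.0], [3.0, 6.0], [5.0, 4.0], [7.0, 8.0]]
cat = [["a", "x"], ["b", "x"], ["a", "y"], ["b", "y"]]
asd = aggregate_group(cont, cat, levels=[["a", "b"], ["x", "y"]])
```

Because every quantity is derived from sums and cross-product sums, the summaries of two disjoint groups could be merged by adding their moments, which is one reason a second-moment representation is convenient for very large data.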