View Submission - EcoSta 2025
A1211
Title: Semi-nonnegative matrix factorization for analyzing qualitative and quantitative data
Authors: Hiroyasu Abe - Wakayama Medical University (Japan) [presenting]
Abstract: Principal component analysis and factor analysis are unsuitable for mixed data, which include both continuous and categorical variables, because factor loadings for categorical variables are often underestimated. Although some methods attempt to address this problem, they lack a well-defined statistical model and yield factor loadings that are hard to compare between continuous and categorical variables. Moreover, the factors in these methods are mostly orthogonal and hence cannot flexibly capture the structure underlying the data. A new matrix decomposition method is developed for exploratory analysis of such mixed-type data. The proposed method is based on a statistical model in which normal and multinomial distributions are assumed for quantitative and qualitative variables, respectively. This avoids underestimating the factor loadings of categorical variables and allows them to be interpreted in the familiar manner, by means of odds ratios. In addition, constraining the factor scores to be nonnegative gives relative flexibility in feature extraction. A numerical example demonstrates that the proposed method is more suitable than related methods for mixed data in terms of the root mean square error of the factor loadings. Additionally, real clinical study data are used to demonstrate that the proposed method yields two or more flexible factors, which are permitted to be similar to yet partially different from one another, in contrast to the orthogonal factors of existing methods.
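To make the nonnegativity constraint on factor scores concrete, the following is a minimal sketch of a generic semi-nonnegative matrix factorization for quantitative variables only. It is not the proposed method: it uses a plain squared-error (normal) criterion, omits the multinomial part for qualitative variables entirely, and relies on the well-known multiplicative update for semi-NMF as a stand-in. The function name `semi_nmf`, its parameters, and the simulated data are all hypothetical and chosen for illustration.

```python
import numpy as np


def semi_nmf(X, k, n_iter=200, seed=0, eps=1e-12):
    """Illustrative semi-NMF: X (n x p) ~ S @ L.T with S >= 0, L unconstrained.

    Assumption: squared-error loss only (quantitative variables); this is a
    generic sketch, not the model described in the abstract.
    """
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    S = rng.uniform(0.1, 1.0, size=(n, k))  # nonnegative factor scores
    errors = []

    def pos(M):
        return (np.abs(M) + M) / 2  # elementwise positive part

    def neg(M):
        return (np.abs(M) - M) / 2  # elementwise negative part (as a nonnegative value)

    for _ in range(n_iter):
        # Loadings: unconstrained least squares solution of S @ L.T ~ X.
        L = np.linalg.lstsq(S, X, rcond=None)[0].T
        errors.append(np.linalg.norm(X - S @ L.T))

        # Scores: multiplicative update, which keeps every entry nonnegative.
        XL = X @ L
        LtL = L.T @ L
        num = pos(XL) + S @ neg(LtL)
        den = neg(XL) + S @ pos(LtL) + eps
        S = S * np.sqrt(num / den)

    # Final loadings matched to the last scores.
    L = np.linalg.lstsq(S, X, rcond=None)[0].T
    errors.append(np.linalg.norm(X - S @ L.T))
    return S, L, errors


# Simulated quantitative data with two nonnegative, non-orthogonal factors.
rng = np.random.default_rng(1)
S_true = rng.uniform(0.0, 1.0, size=(60, 2))
L_true = rng.normal(size=(5, 2))
X = S_true @ L_true.T + 0.05 * rng.normal(size=(60, 5))

S_hat, L_hat, errors = semi_nmf(X, k=2)
print("initial error:", errors[0], "final error:", errors[-1])
```

Because the scores are only required to be nonnegative, the estimated factors in such a decomposition need not be orthogonal, which is the kind of flexibility the abstract contrasts with existing methods.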