A0862
Title: A compositional regression analysis for categorical predictors not jointly observed with a correlating response
Authors: Juyeon Oh - Chonnam National University (Korea, South) [presenting]
Kwangmin Lee - Chonnam National University (Korea, South)
Abstract: In real-world data analysis, variables of interest are often not jointly observed. When each variable is collected by a separate, independent survey, the correlation between the variables is difficult to estimate. A methodology is proposed that addresses this problem by combining individuals with similar characteristics into groups and treating each group as a single data point. A theoretical foundation for this approach is presented, and the method is verified. In a survey that uses multiple-choice questions, the proportions of responses to each question can be treated as compositional data. Compositional data are useful when the breakdown into components is important, but standard statistical models are difficult to apply because the components are subject to a sum constraint. Furthermore, such surveys often contain numerous questions, and including all of them in a regression model can add unnecessary complexity and reduce interpretability. To handle this, a model is proposed that combines an exponential penalty for bi-level selection with a log-contrast model, which was introduced to address the sum constraint of compositional data. The superiority of the model is demonstrated via simulations, and a real data analysis is conducted using two separate national statistical data sources.
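As a rough illustration of the two building blocks named in the abstract, the sketch below aggregates individual multiple-choice answers into group-level compositions and then fits an unpenalized log-contrast regression, in which the coefficients on the log-proportions are constrained to sum to zero. It is a minimal sketch under stated assumptions, not the authors' method: the function names `group_compositions` and `fit_log_contrast`, the pseudo-count used to avoid zero proportions, and the plain least-squares fit are all illustrative choices, and the exponential bi-level penalty is not implemented.

```python
import numpy as np


def group_compositions(group_ids, answers, n_categories, pseudo=0.5):
    """Turn individual multiple-choice answers into one composition per group.

    `pseudo` is an assumed pseudo-count that keeps every proportion positive
    so that logarithms are defined; it is not taken from the abstract.
    """
    group_ids = np.asarray(group_ids)
    answers = np.asarray(answers)
    groups = np.unique(group_ids)
    counts = np.zeros((len(groups), n_categories))
    for i, g in enumerate(groups):
        # Count how often each answer category occurs within group g.
        counts[i] = np.bincount(answers[group_ids == g], minlength=n_categories)
    counts += pseudo
    # Normalize each row so the proportions sum to one.
    return groups, counts / counts.sum(axis=1, keepdims=True)


def fit_log_contrast(X, y):
    """Least squares for y = b0 + log(X) @ beta subject to sum(beta) = 0."""
    Z = np.log(X)
    n, p = Z.shape
    # The centering matrix projects onto the zero-sum subspace; its first
    # p - 1 left singular vectors form an orthonormal basis Q of that subspace.
    centering = np.eye(p) - np.ones((p, p)) / p
    U, _, _ = np.linalg.svd(centering)
    Q = U[:, : p - 1]
    # Writing beta = Q @ gamma enforces the constraint by construction.
    A = np.column_stack([np.ones(n), Z @ Q])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[0], Q @ coef[1:]
```

In this sketch each row returned by `group_compositions` would serve as one observation in `X`, paired with a group-level response `y` taken from the other survey. Reparameterizing through the zero-sum basis keeps the fit an ordinary least-squares problem; a penalized version would add a selection penalty on top of the same constraint.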