View Submission - EcoSta2023
A0591
Title: Two tools for preliminary data analysis: Entropic plots for tail identification and heatmap for variable clustering
Authors: Jialin Zhang - Mississippi State University (United States) [presenting]
Abstract: First, a non-parametric method based on entropic plots is introduced to identify the tail thickness of an underlying discrete distribution. Based on the sample, the method identifies the thickness of the tail and compares it with four benchmark tails: 1) power-decaying tails, 2) sub-exponential decaying tails, 3) near-exponential decaying tails, and 4) exponential decaying tails. The first three benchmarks are considered thick tails. When the underlying discrete distribution is identified as thick-tailed, the method can provide a point estimate and an interval estimate of the parameters to assist further parametric analysis. Second, a non-parametric method based on Generalized Shannon's Entropy (GSE) is introduced to identify similarities among ordered discrete distributions. The underlying ordered discrete distributions are not restricted to a common sample space. The method is theoretically supported by a characterization theorem involving a set of GSEs. Based on the GSEs, a heatmap can be constructed to provide a simple and intuitive visualization that compares the ordered probability distributions of all the discrete random variables and helps obtain a preliminary understanding of the variables.
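
As a rough illustration of the second tool, the sketch below builds the numeric table that such a heatmap could display. The abstract does not define GSE, so the code assumes that the GSE of order m is the Shannon entropy of the distribution whose probabilities are proportional to p_k^m, and it uses simple plug-in (relative frequency) estimates. The names gse_plugin and gse_profile, the chosen orders, and the toy samples are hypothetical and are not taken from the submission.

```python
from collections import Counter
import math


def gse_plugin(sample, order):
    """Plug-in estimate of an order-`order` generalized Shannon entropy.

    Assumption (not stated in the abstract): the order-m GSE is the Shannon
    entropy of the distribution with probabilities proportional to p_k ** m.
    """
    counts = Counter(sample)
    n = sum(counts.values())
    # Raise each relative frequency to the given power, then renormalize.
    powered = [(c / n) ** order for c in counts.values()]
    total = sum(powered)
    induced = [w / total for w in powered]
    # Shannon entropy (natural log) of the induced distribution.
    return -sum(q * math.log(q) for q in induced if q > 0)


def gse_profile(sample, orders=(1, 2, 3, 4)):
    """One heatmap row: GSE estimates over several orders (orders are illustrative)."""
    return [gse_plugin(sample, m) for m in orders]


# Toy variables on different sample spaces; X and Y share the same ordered
# frequencies (4/7, 2/7, 1/7), while Z is uniform over seven categories.
samples = {
    "X": list("aaaabbc"),
    "Y": [10, 10, 10, 10, 20, 20, 30],
    "Z": list("abcdefg"),
}

# Rows = variables, columns = orders; this table is what a heatmap would color.
heatmap_rows = {name: gse_profile(s) for name, s in samples.items()}

for name, row in heatmap_rows.items():
    print(name, [round(v, 3) for v in row])
```

Because the estimates depend only on the sorted frequencies and not on the category labels, X and Y produce identical rows even though their sample spaces differ, which is the sense in which ordered distributions can be compared without a common sample space.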