View Submission - COMPSTAT2024
A0486
Title: Structure learning for zero-inflated counts, with an application to single-cell RNA sequencing data
Authors: Thi Kim Hue Nguyen - University of Padova (Italy) [presenting]
Monica Chiogna - University of Bologna (Italy)
Davide Risso - University of Padua (Italy)
Abstract: In recent years, interest has grown in the problem of recovering, from observed data, the structure of graphs that represent relationships among variables of interest. The reconstruction of a graphical model, known as structure learning, dates back to the early 1990s, and a vast literature considers the problem from various perspectives within both frequentist and Bayesian approaches. Molecular biology applications, however, have played a central role in renewing interest in structure learning. In this field, the abundance of data with increasingly large sample sizes, driven by novel high-throughput technologies, has opened the door to the development and application of structure learning methods, in particular for the estimation of gene regulatory or gene association networks. These applications are challenging, however, since the data consist of high-dimensional counts with high variance and an overabundance of zeros. A general framework is presented for learning the structure of a graph from single-cell RNA-seq data based on the zero-inflated negative binomial distribution. Simulations demonstrate the ability of the approach to retrieve the structure of a graph in various settings, and its utility is shown on real data.
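
For reference, the zero-inflated negative binomial distribution named in the abstract is commonly written as a mixture of a point mass at zero and a negative binomial component. The parameterization below is one standard form and is an assumption here, since the abstract does not state which one the authors use; the symbols \pi (zero-inflation probability), \mu (negative binomial mean) and \theta (inverse dispersion) are introduced for illustration only.

```latex
% Zero-inflated negative binomial pmf (standard mean/dispersion form;
% the symbols \pi, \mu, \theta are assumptions, not taken from the abstract)
\[
P(Y = y) \;=\; \pi \,\mathbf{1}\{y = 0\}
\;+\; (1 - \pi)\,
\frac{\Gamma(y + \theta)}{\Gamma(\theta)\, y!}
\left(\frac{\theta}{\theta + \mu}\right)^{\theta}
\left(\frac{\mu}{\theta + \mu}\right)^{y},
\qquad y = 0, 1, 2, \dots
\]
% Mean of the mixture: E[Y] = (1 - \pi)\,\mu
```

Under this form, a zero count can arise either from the point mass (with probability \pi) or from the negative binomial component itself, which is how the distribution accommodates both the high variance and the excess of zeros described above.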