View Submission - COMPSTAT2024
A0287
Title: Bayesian hierarchical latent variable-based modelling for large and complex genomic datasets
Authors: Mayetri Gupta - University of Glasgow (United Kingdom) [presenting]
Abstract: Advances in genomic sequencing technologies over the past few decades have opened up the possibility of making previously inconceivable biological discoveries at extremely high resolution, but have also led to numerous challenges in accurately analysing the generated data. These data are typically of huge dimension, leading to computational obstacles; are subject to various artefacts; and have distributions that exhibit complex features, such as long-range correlations, non-ellipsoidal shapes, skewness and multimodality, which cause difficulties for inference with standard statistical models. We will discuss some recent examples of Bayesian hierarchical modelling and inference for complex genomic data, along with robust, efficient and powerful computational methods enabling inference and biological discovery. One example involves clustering non-ellipsoidal data: finding subgroups with common features is often a necessary first step in the statistical analysis of large and complex genomic datasets. Another relates to detecting differential epigenetic profiles from high-throughput sequencing data. The performance of these methods is illustrated in simulation studies and in applications to real-life examples from genotyping and DNA methylation studies. This is based on joint work with Edoardo Redivo, Hien Nguyen, Huizi Zhang, Ben Swallow, Tushar Ghosh, Vincent Macaulay and Peter Adams.