CMStatistics 2023
B0468
Title: Directional protein models as computer programs
Authors: Ola Roenning - University of Copenhagen (Denmark) [presenting]
Abstract: Determining the native conformation (three-dimensional structure) of a protein from its amino acid sequence is a paradigmatic problem in biology. While deep learning has brought massive progress (due in part to the large quantity of free, high-quality data), essential issues remain only partly resolved, including modelling protein folding and dynamics (proteins wiggle about), assessing the impact of mutations and protein design, and separating uncertainty due to dynamics from experimental effects such as noise and missing data. These issues could benefit from a probabilistic approach based on deep generative models. This approach is used to drive the development of specialized programming languages (like Pyro and NumPyro) that allow deep generative protein models to be expressed and efficiently combined with experimental observations. Because rotations are nuisance parameters in protein structure models, protein structures are represented by distributions over dihedral angles. In practice, this means extending these languages with circular distributions, whose applications go beyond protein structure prediction. The focus is on the protein structure prediction problem: showcasing how to express a deep generative protein structure model in NumPyro, and discussing modelling extensions and possible pitfalls when working with these types of languages.
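The point about rotations being nuisance parameters can be illustrated with a minimal sketch: a dihedral angle computed from four consecutive atom positions does not change when the whole structure is rotated and translated. The sketch below uses only the Python standard library. The coordinates, the helper names (`dihedral`, `rigid_move`), and the sign convention of the angle are assumptions made for illustration and are not taken from the talk.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Dihedral angle in radians, in (-pi, pi], for four points.

    The sign convention is one common choice; others differ by a sign.
    """
    b0, b1, b2 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1 = cross(b0, b1)                      # normal of the first plane
    n2 = cross(b1, b2)                      # normal of the second plane
    b1_len = math.sqrt(dot(b1, b1))
    m1 = cross(n1, tuple(x / b1_len for x in b1))
    return math.atan2(dot(m1, n2), dot(n1, n2))

def rigid_move(p, angle_z, angle_x, shift):
    """Rotate about z, then about x, then translate (hypothetical helper)."""
    x, y, z = p
    cz, sz = math.cos(angle_z), math.sin(angle_z)
    x, y = cz * x - sz * y, sz * x + cz * y
    cx, sx = math.cos(angle_x), math.sin(angle_x)
    y, z = cx * y - sx * z, sx * y + cx * z
    return (x + shift[0], y + shift[1], z + shift[2])

# Made-up coordinates for four consecutive backbone atoms.
points = [(0.0, 1.0, 0.3), (0.0, 0.0, 0.0), (1.2, 0.1, 0.0), (1.5, 1.0, 0.8)]
moved = [rigid_move(p, 0.7, -1.1, (3.0, -2.0, 5.0)) for p in points]

angle_before = dihedral(*points)
angle_after = dihedral(*moved)
print(angle_before, angle_after)  # equal up to floating-point error
```

Because the angle lives on a circle rather than on the real line, a distribution over it has to be circular (for example a von Mises distribution), which is the kind of extension the abstract refers to.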