A0633
Title: Statistical methods to analyze phylogenetic trees with non-identical leaf sets
Authors: Maria Valdez Cabrera - University of Washington (United States) [presenting]
Abstract: Phylogenetic trees are frequently used to describe the evolutionary history of a set of microorganisms. Different genes shared by these microorganisms may have different evolutionary histories, motivating methods to analyze collections of trees. To allow comparisons between phylogenetic trees, a non-Euclidean metric space (BHV space) was introduced in 2001; it accounts for the discrete branching structure of each tree and allows the branch lengths to vary continuously. Unfortunately, only trees with identical leaf sets are elements of this space. In practice, this requirement might not be reasonable: some microorganisms may not carry a particular gene, or a gene may not be detected in a sample due to laboratory and technical artefacts. Motivated by the path continuity of BHV geodesics, a metric space is proposed for phylogenetic trees with potentially non-identical leaf sets. An algorithm is introduced to compute the distance in this metric space, and the use of the Fréchet mean as a potential summary for a tree collection is discussed. The long-term goal is to create statistical tools to analyze a collection of phylogenetic trees with non-identical leaf sets. This is joint work with Amy Willis.