View Submission - COMPSTAT2022
A0569
Title: Towards the optimization of large-scale phylogenetic trees
Authors: Catia Vaz - INESC-ID / Instituto Superior de Engenharia de Lisboa, Instituto Politécnico de Lisboa (Portugal) [presenting]
Alexandre Francisco - INESC-ID and IST, Universidade de Lisboa (Portugal)
Abstract: Several distance-based phylogenetic inference algorithms, widely used in the surveillance of infectious diseases, outbreak investigation, and studies of the natural history of infections, follow a hierarchical clustering approach to compute phylogenetic trees. These algorithms differ in the distance measure and in the optimization criteria they use. Moreover, an inferred tree is not necessarily the best tree for the underlying evolution model. For instance, combinatorial optimization algorithms such as goeBURST provide an optimal tree under a given criterion, yet that tree may not be the most representative phylogeny, because distance does not always correlate with divergence time. Trees can be further optimized using methods based on Subtree Pruning and Regrafting, Nearest Neighbor Interchange, or Tree Bisection and Reconnection, but these methods are often expensive to compute, particularly for large studies. We therefore present an extension of goeBURST that relies on efficient local optimizations to improve the inferred phylogeny. These optimizations are applied to selected edges, chosen according to the maximum likelihood of two alternative evolutionary models. The underlying principles will be presented, together with results on both the precision and the sensitivity of the algorithm when reconstructing phylogenetic trees from simulated data.
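
As a rough illustration of the distance-based setting the abstract describes, the sketch below builds a minimum spanning tree over pairwise Hamming distances between allelic profiles, using Kruskal's algorithm with a union-find structure. It is a simplification under stated assumptions: the function names, the toy profiles, and the plain index-order handling of ties are hypothetical, goeBURST's own tiebreak rules are omitted, and the likelihood-based local optimization proposed by the authors is not reproduced here.

```python
from itertools import combinations


def hamming(a, b):
    """Number of loci at which two allelic profiles differ."""
    return sum(x != y for x, y in zip(a, b))


def minimum_spanning_tree(profiles):
    """Kruskal's algorithm over the complete graph of pairwise Hamming distances.

    Ties are broken by node index only (an assumption of this sketch;
    goeBURST applies additional tiebreak rules that are not modeled here).
    Returns a list of (i, j, distance) edges.
    """
    parent = list(range(len(profiles)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All candidate edges, sorted by distance and then by node indices.
    edges = sorted(
        (hamming(profiles[i], profiles[j]), i, j)
        for i, j in combinations(range(len(profiles)), 2)
    )

    tree = []
    for dist, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # the edge joins two components, so keep it
            parent[root_i] = root_j
            tree.append((i, j, dist))
    return tree


# Toy allelic profiles (hypothetical data, four loci each).
profiles = [(1, 1, 1, 1), (1, 1, 1, 2), (1, 1, 2, 2), (3, 1, 1, 1)]
tree = minimum_spanning_tree(profiles)
```

Here `tree` holds the selected edges as `(i, j, distance)` triples. Because several edges share the same distance, a different tiebreak order can yield a different but equally weighted tree, which is one reason the choice of optimization criterion matters.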