|
Phylogeny, the study of the evolutionary relationships among living organisms, is central to biology. Relationships among the organisms (or taxa) are modeled as a phylogenetic tree.
Phylogeny reconstruction usually produces multiple trees. Having more than one tree is unsatisfactory, so the trees are typically combined into one "representative" tree using a consensus method.
In particular, computational methods construct numerous trees with the same objective score, and a consensus of the top-scoring trees is returned as the answer. Our experiments indicate that the consensus of trees with near-optimal scores is sufficiently close topologically to the consensus of trees with the best known scores.
Thus, phylogenetic search heuristics can be stopped significantly earlier than is currently done, which can save weeks of computation for large datasets. We propose an objective criterion that allows a user to decide when the trees are "good enough" and present online consensus algorithms that support the implementation of this criterion.
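As a rough illustration of what an online consensus computation might look like, the sketch below keeps running clade counts so that a majority-rule consensus can be updated as each new tree arrives, and compares two consensus trees by the number of clades they do not share. Everything here is an assumption made for illustration: the class name `OnlineMajorityConsensus`, the helper `consensus_distance`, the choice of majority rule, and the representation of a rooted tree as the set of its nontrivial clades are hypothetical and are not the criterion or the algorithms proposed in this work.

```python
from collections import Counter


class OnlineMajorityConsensus:
    """Hypothetical sketch: majority-rule consensus maintained incrementally.

    Each tree is assumed to be given as a set of clades, where a clade is a
    frozenset of taxon names.
    """

    def __init__(self):
        self.counts = Counter()  # clade -> number of trees containing it
        self.num_trees = 0

    def add_tree(self, clades):
        # set() guards against a clade being counted twice for one tree.
        self.counts.update(set(clades))
        self.num_trees += 1

    def consensus(self):
        # Keep clades that appear in strictly more than half of the trees.
        return {c for c, n in self.counts.items() if 2 * n > self.num_trees}


def consensus_distance(clades_a, clades_b):
    """Number of clades present in exactly one of the two consensus trees."""
    return len(clades_a ^ clades_b)


# Toy data over taxa A..E (invented for illustration only).
best_trees = [
    {frozenset("AB"), frozenset("ABC")},
    {frozenset("AB"), frozenset("ABC")},
]
near_optimal_tree = {frozenset("AC"), frozenset("ABC")}

online = OnlineMajorityConsensus()
for tree in best_trees:
    online.add_tree(tree)
best_consensus = online.consensus()

# Adding a near-optimal tree updates the counts without a full recomputation.
online.add_tree(near_optimal_tree)
near_consensus = online.consensus()

distance = consensus_distance(best_consensus, near_consensus)
```

In this toy case the clade `AC` appears in only one of three trees, so both consensus trees contain the same clades. A stopping rule in this spirit could halt the search once the distance stays below a user-chosen threshold, though the threshold and the distance measure are again assumptions of this sketch.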
Another source of multiple phylogenetic reconstructions is the use of different types of data, such as morphology, geography, and paleontology. The data can be viewed as constraints imposed on the structure of the phylogeny. We present a new constraint-based approach to phylogeny reconstruction that is capable of handling heterogeneous data. We view tree consensus methods as techniques for combining various types of constraints and analyze their properties in this context.
This work is joint with Bernard M. E. Moret, Usman Roshan, Tandy J. Warnow, and Tiffani L. Williams.
|