A daily-updated tree of (sequenced) life as a reference for genome research

The extent of agreement between the NCBI taxonomy and the molecular data.




We report a daily-updated sequenced/species Tree Of Life (sTOL) as a reference for the increasing number of cellular organisms with their genomes sequenced. The sTOL builds on a likelihood-based weight calibration algorithm to consolidate NCBI taxonomy information in concert with unbiased sampling of molecular characters from whole genomes of all sequenced organisms. Via quantifying the extent of agreement between taxonomic and molecular data, we observe there are many potential improvements that can be made to the status quo classification, particularly in the Fungi kingdom; we also see that the current state of many animal genomes is rather poor. To augment the use of sTOL in providing evolutionary contexts, we integrate an ontology infrastructure and demonstrate its utility for evolutionary understanding on: nuclear receptors, stem cells and eukaryotic genomes. The sTOL (http://supfam.org/SUPERFAMILY/sTOL) provides a binary tree of (sequenced) life, and contributes to an analytical platform linking genome evolution, function and phenotype.

Scientific Reports
 
3,
 
Article number:
 
2015
 
doi:10.1038/srep02015
Received
 
Accepted
 
Published