Title: Reconstructing the backbone of the Saccharomycotina yeast phylogeny using genome-scale data

Author
SHEN, XING-XING - Vanderbilt University
ZHOU, XIAOFAN - Vanderbilt University
KOMINEK, JACEK - University Of Wisconsin
Kurtzman, Cletus
HITTINGER, CHRIS - University Of Wisconsin
ROKAS, ANTONIS - Vanderbilt University

Submitted to: G3, Genes/Genomes/Genetics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 9/19/2016
Publication Date: 12/1/2016
Citation: Shen, X.-X., Zhou, X., Kominek, J., Kurtzman, C.P., Hittinger, C.T., Rokas, A. 2016. Reconstructing the backbone of the Saccharomycotina yeast phylogeny using genome-scale data. G3, Genes/Genomes/Genetics. 6(12):3927-3939.

Interpretive Summary: Accurate identification of yeasts is commonly made from DNA “barcode” sequences, but these sequences are usually too short to determine relationships among species. In an effort to understand if unique metabolic properties of yeasts are found only among closely related species, the genomes of 86 yeasts that represent 9 of the 11 major lineages were used to generate a 1,233-gene data matrix that included many species of agricultural, biotechnological, and taxonomic importance. Sequence analysis showed that certain unique properties, such as ability to grow on methanol, are found only among closely related species. These data demonstrate that genome sequences can be used to predict metabolic properties among species, which will allow selection of yeasts to address specific problems in biotechnology and agriculture.

Technical Abstract: Understanding the phylogenetic relationships among the yeasts of the subphylum Saccharomycotina is a prerequisite for understanding the evolution of their metabolisms and ecological lifestyles. In the last two decades, the use of rDNA and multi-locus data sets has greatly advanced our understanding of the yeast phylogeny, but many deep relationships remain unsupported. In contrast, phylogenomic analyses have involved relatively few taxa and lineages, which were often selected with limited consideration for covering the breadth of yeast biodiversity. Here we used genome sequence data from 86 publicly available yeast genomes representing 9 of the 11 major lineages and 10 non-yeast fungal outgroups to generate a 1,233-gene, 96-taxon data matrix. Species phylogenies reconstructed using two different methods (concatenation and coalescence) and two data matrices (amino acids or the first two codon positions) yielded identical and highly supported relationships among the 9 major lineages. All lineages were monophyletic except the one comprising the family Pichiaceae. Most interrelationships among yeast species were robust across the two methods and data matrices. However, 8 of the 93 internodes conflicted between analyses or data sets, mostly in the clade defined by species that have reassigned the CUG codon to encode serine instead of leucine, or in the clade defined by a whole genome duplication. These phylogenomic analyses provide a robust roadmap for future comparative work across the yeast subphylum in the disciplines of taxonomy, molecular genetics, evolutionary biology, ecology, and biotechnology. To further this end, we have also provided a BLAST server to query the 86 Saccharomycotina genomes, which can be found at http://y1000plus.org/blast.
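
Illustrative sketch (not the authors' pipeline): the abstract mentions a concatenated data matrix and a nucleotide matrix restricted to the first two codon positions. The Python below shows, under stated assumptions, what those two operations can look like on toy data. The function names, taxon labels, and sequences are hypothetical; real analyses would work on aligned sequence files with dedicated tools. The last helper notes that a fully resolved unrooted tree with n tips has n - 3 internal branches, which is consistent with the 93 internodes reported for 96 taxa.

```python
from typing import Dict, List


def first_two_codon_positions(cds: str) -> str:
    """Drop every third nucleotide from an in-frame coding sequence."""
    return "".join(base for i, base in enumerate(cds) if i % 3 != 2)


def concatenate(genes: List[Dict[str, str]], taxa: List[str]) -> Dict[str, str]:
    """Join per-gene alignments into one matrix; pad absent taxa with gaps."""
    parts: Dict[str, List[str]] = {taxon: [] for taxon in taxa}
    for gene in genes:
        # All sequences within one aligned gene are assumed to share a length.
        length = len(next(iter(gene.values())))
        for taxon in taxa:
            parts[taxon].append(gene.get(taxon, "-" * length))
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


def unrooted_internodes(n_taxa: int) -> int:
    """Internal branches in a fully resolved unrooted tree with n_taxa tips."""
    return n_taxa - 3


# Hypothetical toy alignments: two genes, three taxa, sp_B lacks gene 2.
taxa = ["sp_A", "sp_B", "sp_C"]
gene1 = {"sp_A": "ATGGCC", "sp_B": "ATGGCT", "sp_C": "ATGGCA"}
gene2 = {"sp_A": "TTTAAA", "sp_C": "TTCAAG"}

trimmed = [
    {taxon: first_two_codon_positions(seq) for taxon, seq in gene.items()}
    for gene in (gene1, gene2)
]
supermatrix = concatenate(trimmed, taxa)

for taxon in taxa:
    print(taxon, supermatrix[taxon])

# 86 yeast genomes plus 10 outgroups, as stated in the abstract.
print(unrooted_internodes(86 + 10))
```

In this toy case the sequences differ only at third codon positions, so the trimmed matrix is identical across taxa apart from the gap padding for the missing gene. This illustrates why removing third positions discards the fastest-evolving sites.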