Wai, Ching Man
Submitted to: Gigascience
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 12/13/2017
Publication Date: 12/13/2017
Citation: Edger, P.P., Vanburen, R., Colle, M., Poorten, T.J., Wai, C., Niederhuth, C.E., Alger, E., Ou, S., Acharya, C.B., Wang, J., Callow, P., Mckain, M.R., Shi, J., Collier, C., Xiong, Z., Mower, J.P., Slovin, J.P., Hytönen, T., Jiang, N., Childs, K., Knapp, S.J. 2017. Single-molecule sequencing and optical mapping yields an improved genome of woodland strawberry (Fragaria vesca) with chromosome-scale contiguity. Gigascience. https://doi.org/10.1093/gigascience/gix124.
Interpretive Summary: The dessert strawberry is a highly complex plant to study in the laboratory, so over the last 5 to 10 years tools have become available for using the simpler and closely related woodland strawberry to understand traits important for strawberry production. One such tool is the DNA sequence, which is important for identifying the genes that control traits such as fruit quality and longevity or resistance to disease, as well as traits related to how strawberry plants grow and reproduce. This paper describes a substantial improvement of the existing woodland strawberry sequence using more modern technology, which allowed for the discovery of a large amount of additional sequence not present in previous versions and the identification of almost 1,500 additional genes. This improved genome sequence will enhance scientists' understanding of the function of genes involved in the production of improved varieties with enhanced disease resistance, and will enable breeders to produce improved varieties using the most modern techniques.
Technical Abstract: Although draft genomes are available for most agronomically important plant species, the majority are incomplete, highly fragmented, and often riddled with assembly and scaffolding errors. These assembly issues hinder advances in tool development for functional genomics and systems biology. Here we utilized a robust, cost-effective approach to produce 'platinum' quality reference genomes. We report a near-complete genome of diploid woodland strawberry (Fragaria vesca) using single-molecule real-time sequencing from Pacific Biosciences (PacBio). This assembly has a contig N50 length of ~7.9 Mb, representing a ~300-fold improvement over the previous version. The vast majority (>99.8%) of the assembly was anchored to seven pseudomolecules using two sets of optical maps from Bionano Genomics. We obtained ~24.96 million base pairs (Mb) of sequence not present in the previous version of the F. vesca genome and produced an improved annotation that includes 1,496 new genes. Comparative syntenic analyses uncovered numerous large-scale scaffolding errors present in each chromosome in the previously published version of the F. vesca genome. Our results highlight the need to improve existing short-read-based reference genomes. Furthermore, we demonstrate how genome quality impacts commonly used analyses for addressing both fundamental and applied biological questions.
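Note on contig N50: The abstract reports contiguity as a contig N50, a statistic that may be unfamiliar to some readers. Under the standard definition, N50 is the length of the contig at which the running total, taken from the longest contig downward, first reaches half of the total assembly length. The Python sketch below illustrates that definition only. The function name `n50` and the toy contig lengths are hypothetical and are not taken from the paper or its data.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (standard definition)."""
    if not lengths:
        raise ValueError("need at least one contig length")
    # Walk contigs from longest to shortest, accumulating their lengths.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that brings the running total to at least half
        # of the assembly length defines the N50.
        if running >= half_total:
            return length


# Hypothetical toy contig lengths, not values from the F. vesca assembly.
toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_contigs))
```

A larger N50 means that half of the assembled sequence sits in longer contigs, which is why the statistic is commonly used to compare the contiguity of two assemblies of the same genome.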