|FELDMAN, MITCHELL - University Of California, Davis|
|PINCOT, DOMINIQUE - University Of California, Davis|
|FAMULA, RANDI - University Of California, Davis|
|VACHEV, MICHAELA - University Of California, Davis|
|MADERA, MARY - University Of California, Davis|
|ZERBE, PHILIPP - University Of California, Davis|
|MARS, KRISTIN - Pacific Biosciences Inc|
|PELUSO, PAUL - Pacific Biosciences Inc|
|RANK, DAVID - Pacific Biosciences Inc|
|OU, SHUJUN - Iowa State University|
|SASKI, CHRISTOPHER - Clemson University|
|ACHARYA, CHARLOTTE - University Of California, Davis|
|COLE, GLENN - University Of California, Davis|
|YOCCA, ALAN - Michigan State University|
|PLATTS, OKATT - Michigan State University|
|EDGER, PATRICK - Michigan State University|
|KNAPP, STEVEN - University Of California, Davis|
Submitted to: bioRxiv
Publication Type: Research Notes
Publication Acceptance Date: 11/4/2021
Publication Date: 11/4/2021
Citation: Hardigan, M.A., Feldman, M.J., Pincot, D.D., Famula, R.A., Vachev, M.V., Madera, M.A., Zerbe, P., Mars, K., Peluso, P., Rank, D., Ou, S., Saski, C.A., Acharya, C.B., Cole, G.S., Yocca, A.E., Platts, O., Edger, P.P., Knapp, S.J. 2021. Blueprint for phasing and assembling the genomes of heterozygous polyploids: application to the octoploid genome of strawberry. bioRxiv. https://doi.org/10.1101/2021.11.03.467115.
Interpretive Summary: Accurate, high-quality genome assemblies are essential tools and the foundation on which modern plant breeding and genetics studies are built. Polyploid plant species include numerous agriculturally important crops, among them many of the small fruit and berry species enjoyed by American consumers. However, polyploid genomes are inherently complex, which has delayed the accurate assembly and annotation of important polyploid crop genomes relative to simpler diploid species. In this study, we show how state-of-the-art DNA sequencing technology, pedigree-informed analyses, and dense genetic maps can be combined to produce gold-standard genomes of polyploid plants, in this case the highly heterozygous genome of octoploid strawberry (Fragaria × ananassa). We present 'FaRR1', the assembled genome of a highly heterozygous, day-neutral strawberry cultivar from the University of California, as an example of a gold-standard polyploid plant genome and a tool for the global strawberry research community.
Technical Abstract: The challenge of allelic diversity for assembling haplotypes is exemplified in polyploid genomes containing homoeologous chromosomes of identical ancestry, and significant homologous variation within their ancestral subgenomes. Cultivated strawberry (Fragaria × ananassa) and its wild progenitors are outbred octoploids (2n = 8x = 56) in which up to eight homologous and homoeologous alleles are preserved. This introduces significant risk of haplotype collapse, switching, and chimeric fusions during assembly. Using third-generation HiFi sequences from PacBio, we assembled the genome of the day-neutral octoploid F. × ananassa hybrid ‘Royal Royce’ from the University of California. Our goal was to produce subgenome- and haplotype-resolved assemblies of all 56 chromosomes, accurately reconstructing the parental haploid chromosome complements. Previous work has demonstrated that partitioning sequences by parental phase supports direct assembly of haplotypes in heterozygous diploid species. We combined the accuracy of HiFi sequence data with pedigree-informed sequencing to partition long-read sequences by phase and reduce the downstream risk of subgenomic chimeras during assembly. We used an octoploid strawberry recombination breakpoint map containing 3.6 M variants to identify and break chimeric junctions and to scaffold the phase-1 and phase-2 octoploid assemblies. The N50 contiguity of the phase-1 and phase-2 assemblies prior to scaffolding and gap-filling was 11 Mb. The final haploid assembly represented seven of 28 chromosomes in a single contiguous sequence, and averaged fewer than three gaps per pseudomolecule. Additionally, we re-annotated the octoploid genome to produce a custom F. × ananassa repeat library and an improved set of gene models based on IsoSeq transcript data and an expansive RNA-seq expression atlas. Here we present ‘FaRR1’, a gold-standard reference genome of F. × ananassa cultivar ‘Royal Royce’, to assist future genomic research and molecular breeding of allo-octoploid strawberry.
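For readers unfamiliar with the N50 contiguity statistic quoted in the abstract, the following is a minimal Python sketch of how it is conventionally computed from a list of contig lengths. It is illustrative only and is not taken from the authors' pipeline; the function name `n50` and the toy lengths are assumptions made for this example.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first contig that brings us to half the total.
        if 2 * running >= total:
            return length


# Toy contig lengths (hypothetical values, not data from the study).
toy_contigs = [2, 3, 4, 5, 6]
print(n50(toy_contigs))
```

Under this definition, an N50 of 11 Mb means that at least half of the assembled bases lie in contigs of 11 Mb or longer.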