|DAVENPORT, KIMBERLY - University Of Idaho|
|WORLEY, KIMBERLY - Baylor College Of Medicine|
|MURALI, SHWETHA - Baylor College Of Medicine|
|SALAVATI, MAZDAK - Roslin Institute|
|CLARK, EMILY - Roslin Institute|
|COCKETT, NOELLE - Utah State University|
|Heaton, Michael - Mike|
|Smith, Timothy - Tim|
|MURDOCH, BRENDA - University Of Idaho|
|Rosen, Benjamin - Ben|
Submitted to: GigaScience
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 12/25/2021
Publication Date: 2/4/2022
Citation: Davenport, K.M., Bickhart, D.M., Worley, K.C., Murali, S.C., Salavati, M., Clark, E.L., Cockett, N., Heaton, M.P., Smith, T.P., Murdoch, B.M., Rosen, B.D. 2022. An improved ovine reference genome assembly to facilitate in-depth functional annotation of the sheep genome. GigaScience. 11. Article giab096. https://doi.org/10.1093/gigascience/giab096.
Interpretive Summary: Sheep are the foundational livestock species in the US wool industry. In this manuscript, we present improved genetic resources for sheep. These resources will improve selective breeding of all sheep breeds for better production. The resources are of such high quality that they will be considered the preferred sheep genetic resources of the international community.
Technical Abstract: Background: The domestic sheep (Ovis aries) is an important agricultural species raised for meat, wool, and milk across the world. A high-quality reference genome for this species enhances the ability to discover genetic mechanisms influencing biological traits. Further, a high-quality reference genome allows for precise functional annotation of gene regulatory elements. Rapid advances in genome assembly algorithms and the emergence of increasingly long sequence reads provide the opportunity for an improved de novo assembly of the sheep reference genome. Findings: Short-read Illumina (55x coverage), long-read PacBio (75x coverage), and Hi-C data from this ewe retrieved from public databases were combined with an additional 50x coverage of Oxford Nanopore data and assembled with Canu v1.9. The assembled contigs were scaffolded using Hi-C data with Salsa v2.2, gaps were filled with PBsuite v15.8.24, and the assembly was polished with Nanopolish v0.12.5. After duplicate contig removal with PurgeDups v1.0.1, chromosomes were oriented and polished with two rounds of a pipeline that used freebayes v1.3.1 to call variants, Merfin to validate them, and BCFtools to generate the consensus FASTA. The ARS-UI_Ramb_v2.0 assembly has improved continuity (contig N50 of 43.19 Mb), with a 19-fold and 38-fold decrease in the number of scaffolds compared with Oar_rambouillet_v1.0 and Oar_v4.0, respectively. ARS-UI_Ramb_v2.0 has greater per-base accuracy and fewer insertions and deletions identified from mapped RNA sequence than previous assemblies. Conclusions: The ARS-UI_Ramb_v2.0 assembly is a substantial improvement that will optimize the functional annotation of the sheep genome and facilitate improved mapping accuracy of genetic variant and expression data for traits in sheep.
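The contig N50 quoted above is a standard continuity metric: the length of the contig at which the running total, taken from longest to shortest, first reaches half of the total assembled length. The following Python sketch illustrates that definition only. It is not the authors' tooling, the function name `n50` is hypothetical, and the contig lengths are made-up toy values rather than data from ARS-UI_Ramb_v2.0.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembled length.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy contig lengths (arbitrary units), not real assembly data.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

With these toy values the total length is 300, so the running sum reaches the halfway point of 150 at the second-longest contig. A larger N50 for the same genome size means that fewer, longer contigs cover half of the assembly, which is the sense in which the abstract reports improved continuity.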