
Research Project: Determining Influence of Microbial, Feed, and Animal Factors on Efficiency of Nutrient Utilization and Performance in Lactating Dairy Cows

Location: Cell Wall Biology and Utilization Research

Title: Exploiting long read sequencing technologies to establish high quality highly contiguous pig reference genome assemblies

Warr, Amanda
Hall, Richard
Kim, Kirsti
Tseng, Elizabeth
Koren, Sergey
Phillippy, Adam
Bickhart, Derek
Rosen, Benjamin
Schroeder, Steven - Steve
Hume, David
Talbot, Richard
Rund, Laurie
Schook, Lawrence
Chow, William
Howe, Kirstin
Nonneman, Danny - Dan
Rohrer, Gary
Putnam, Nicholas
Green, Ed
Watson, Mick
Smith, Timothy - Tim
Archibald, Alan

Submitted to: Plant and Animal Genome Conference
Publication Type: Abstract Only
Publication Acceptance Date: 10/31/2016
Publication Date: 1/14/2017
Citation: Warr, A., Hall, R., Kim, K., Tseng, E., Koren, S., Phillippy, A.M., Bickhart, D.M., Rosen, B.D., Schroeder, S.G., Hume, D.A., Talbot, R., Rund, L., Schook, L.B., Chow, W., Howe, K., Nonneman, D.J., Rohrer, G.A., Putnam, N., Green, E., Watson, M., Smith, T.P., Archibald, A.L. 2017. Exploiting long read sequencing technologies to establish high quality highly contiguous pig reference genome assemblies [abstract]. Plant and Animal Genome Conference XX, January 14-18, 2017, San Diego, California. Paper No. 25025.

Interpretive Summary:

Technical Abstract: The current pig reference genome sequence (Sscrofa10.2) was established using Sanger sequencing, following the clone-by-clone hierarchical shotgun sequencing approach used in the public human genome project. However, because sequence coverage was low (4-6x), the resulting assembly was only of draft quality. We have built new de novo genome assemblies from whole genome shotgun (WGS) sequence reads generated using Pacific Biosciences (PacBio) long-read sequencing technology for two pigs: the original reference animal (Duroc sow 2-14) and a Duroc/Landrace/Yorkshire crossbred barrow. For each animal, about 60-70x coverage of WGS data was assembled with the Falcon assembler and error-corrected with Quiver/Arrow and Pilon using high-coverage WGS PacBio and Illumina reads, respectively. The estimated accuracy (99.999%) of the Duroc assembly meets the requirement of a Gold standard finished sequence. The Duroc assembly was scaffolded with paired-end reads from isogenic BAC and fosmid clones. The crossbred assembly was scaffolded using Dovetail’s HiRise. The current statistics for these assemblies are: Duroc 2-14 (Sscrofa11) for SSC1-18, SSCX (2.39 Gbp, 122 contigs; contig N50 = 58.5 Mbp; scaffold N50 = 107.6 Mbp); Duroc/Landrace/Yorkshire crossbred for SSC1-18, SSCX, SSCY (2.62 Gbp, 14,924 contigs; contig N50 = 6.5 Mbp; scaffold N50 = 132 Mbp). The BAC and fosmid clone resource from Duroc 2-14 will facilitate further targeted sequence closure. These improved genome assemblies will be a key resource for research in pigs and will enable applications in agriculture and biomedicine. The assemblies are being deposited in the public database under the pre-publication data release terms of the Toronto Statement (Nature 461:168-70).
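
For readers unfamiliar with the contiguity and accuracy figures quoted above, the following is a minimal sketch of how a contig or scaffold N50 and a Phred-scaled quality value are conventionally computed. It is not the authors' pipeline. The function names `n50` and `phred_qv` and the toy contig lengths are hypothetical and chosen only for illustration.

```python
import math


def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downwards, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def phred_qv(accuracy):
    """Convert a per-base accuracy (between 0 and 1) to a Phred-scaled QV."""
    error_rate = 1.0 - accuracy
    if not 0.0 < error_rate < 1.0:
        raise ValueError("accuracy must be strictly between 0 and 1")
    return -10.0 * math.log10(error_rate)


if __name__ == "__main__":
    # Hypothetical toy contig lengths (in Mbp), NOT data from the assemblies.
    toy_contigs = [80, 70, 50, 40, 30, 20, 10]
    print("toy N50:", n50(toy_contigs))
    # An estimated accuracy of 99.999% is one error per 100,000 bases.
    print("QV for 99.999% accuracy:", round(phred_qv(0.99999), 2))
```

Under this definition, an estimated accuracy of 99.999% corresponds to roughly one error per 100,000 bases, or a quality value of about 50.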