Research Project: Enhancing Genetic Merit of Ruminants Through Improved Genome Assembly, Annotation, and Selection

Location: Animal Genomics and Improvement Laboratory

Title: De novo assembly of the cattle reference genome with single-molecule sequencing

Author
Rosen, Benjamin
Bickhart, Derek
SCHNABEL, ROBERT - University Of Missouri
KOREN, SERGEY - National Institutes Of Health (NIH)
ELSIK, CHRISTINE - University Of Missouri
TSENG, ELIZABETH - Pacific Biosciences Inc
ROWAN, TROY - University Of Missouri
LOW, WAI - University Of Adelaide
ZIMIN, ALEKSEY - Johns Hopkins University
COULDREY, CHRISTINE - Collaborator
HALL, RICHARD - Pacific Biosciences Inc
Li, Wenli
RHIE, ARANG - National Institutes Of Health (NIH)
GHURYE, JAY - University Of Maryland
MCKAY, STEPHANIE - University Of Vermont
THIBAUD-NISSEN, FRANCOISE - National Institutes Of Health (NIH)
HOFFMAN, JINNA - National Institutes Of Health (NIH)
MURDOCH, BRENDA - University Of Idaho
Snelling, Warren
McDaneld, Tara
HAMMOND, JOHN - The Pirbright Institute
SCHWARTZ, JOHN - The Pirbright Institute
NANDOLO, WILSON - Lilongwe University Of Agriculture And Natural Resources
HAGEN, DARREN - Oklahoma State University
DREISCHER, CHRISTIAN - Computomics Gmbh
SCHULTHEISS, SEBASTIAN - Computomics Gmbh
Schroeder, Steven - Steve
PHILLIPPY, ADAM - National Institutes Of Health (NIH)
Cole, John
Van Tassell, Curtis - Curt
Liu, Ge - George
Smith, Timothy - Tim
MEDRANO, JUAN - University Of California, Davis

Submitted to: Gigascience
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 2/14/2020
Publication Date: 3/1/2020
Citation: Rosen, B.D., Bickhart, D.M., Schnabel, R.D., Koren, S., Elsik, C.G., Tseng, E., Rowan, T.N., Low, W.Y., Zimin, A., Couldrey, C., Hall, R., Li, W., Rhie, A., Ghurye, J., McKay, S.D., Thibaud-Nissen, F., Hoffman, J., Murdoch, B.M., Snelling, W.M., McDaneld, T.G., Hammond, J.A., Schwartz, J.C., Nandolo, W., Hagen, D.E., Dreischer, C., Schultheiss, S.J., Schroeder, S.G., Phillippy, A.M., Cole, J.B., Van Tassell, C.P., Liu, G., Smith, T.P.L., Medrano, J.F. 2020. De novo assembly of the cattle reference genome with single-molecule sequencing. GigaScience. 9(3):1-9. https://doi.org/10.1093/gigascience/giaa021.
DOI: https://doi.org/10.1093/gigascience/giaa021

Interpretive Summary: Genomic tools have enabled major advances in the genetic improvement of cattle since the release of the cattle reference genome in 2004. These tools require an accurate and complete genome assembly. We have applied the latest DNA sequencing and scaffolding technologies to generate an improved reference genome for cattle, ARS-UCD1.2. We used the same animal as the original assembly, which makes it easier to transfer and interpret results obtained with the earlier version, while improving the genome quality manyfold. We also generated data to define more accurately which genes are present in cattle. These data have been made publicly available through NCBI.

Technical Abstract: Major advances in selection progress for cattle have been made following the introduction of genomic tools over the past 10-12 years. These tools depend upon the Bos taurus reference genome (UMD3.1.1), which was created using now-outdated technologies and suffers from a variety of deficiencies and inaccuracies. We present the new reference genome for cattle, ARS-UCD1.2, based on the same animal as the original to facilitate transfer and interpretation of results obtained from the earlier version, but applying a combination of modern technologies in a de novo assembly to increase continuity, accuracy, and completeness. The assembly comprises 2.7 Gb and is >250x more continuous than the original assembly, with a contig N50 >25 Mb and an L50 of 32. We also greatly expanded the supporting RNA-based data for annotation, which identifies 30,396 total genes (21,039 protein coding). The new reference assembly is accessible in annotated form for public use.
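
The continuity figures quoted in the abstract (contig N50 and L50) follow the usual definitions: with contigs sorted from longest to shortest, N50 is the length of the contig at which the running total first reaches half of the assembly size, and L50 is the number of contigs needed to get there. The following is a minimal Python sketch of that calculation, not the authors' pipeline. The function name `n50_l50` and the contig lengths are hypothetical toy values and are not taken from ARS-UCD1.2.

```python
from typing import Iterable, Tuple


def n50_l50(lengths: Iterable[int]) -> Tuple[int, int]:
    """Return (N50, L50) for a collection of contig lengths.

    N50: length of the contig at which the cumulative sum of lengths,
         taken from longest to shortest, first reaches half the total.
    L50: number of contigs included at that point.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one contig length is required")

    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count

    # Unreachable for non-empty input: the full sum always reaches half.
    raise AssertionError("cumulative sum never reached half the total")


# Hypothetical toy contig lengths (arbitrary units), for illustration only.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
n50, l50 = n50_l50(toy_contigs)
print(f"N50 = {n50}, L50 = {l50}")
```

On the toy input the total is 300, so the half-way mark of 150 is reached after the two longest contigs (80 + 70), giving an N50 of 70 and an L50 of 2. Read the same way, the reported L50 of 32 means that the 32 longest contigs together cover at least half of the assembled sequence.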