NOVEL SNP DISCOVERY AND GENOME ASSEMBLY OF BOS INDICUS (NELORE) CATTLE
Animal Genomics and Improvement Laboratory
2012 Annual Report
1a. Objectives (from AD-416):
ARS and the COOPERATOR (UNESP-Aracatuba) are interested in developing a second draft sequence assembly of the bovine genome based on the bovine sub-species Bos indicus, which encompasses the prototypical beef and dairy animals of tropical and sub-tropical production environments. Such a resource is essential for developing informative SNP markers to intensify outputs in current production scenarios and to enable selection of locally adapted breeds in developing-world cattle populations. Our Project Plan has two objectives:
1) generate 40X genome coverage of DNA sequence from a linebred Nelore bull to use in building a genome sequence assembly, and
2) generate 10X coverage from 10 additional Nelore animals (1X per animal) to identify SNP for development of a 660K SNP assay that contains informative SNP for all breed types of cattle.
The COOPERATOR has the funding and genomic DNA material to properly initiate this project, and the expertise and knowledge of the extensive cattle production systems to extend application of the results to improve production. ARS has the expertise and facilities to generate the DNA sequence information on next-generation sequencing platforms and to characterize the SNP that are potentially informative across all breeds of cattle.
1b. Approach (from AD-416):
The project will be managed jointly by ARS and the COOPERATOR. The COOPERATOR will evaluate the Nelore herdbook and mtDNA sequence to guide collection of appropriate tissues and/or DNA material for sequencing. ARS will analyze the heterozygosity indices of candidate animals using BovineSNP50 data. This analysis will aid final selection by the COOPERATOR of the appropriate animal for genome sequencing. The COOPERATOR will then provide genomic DNA templates from this animal and from 10 additional Nelore animals (for 1X coverage sequencing per animal), along with funding to purchase reagents that will generate a 40X genome equivalent of DNA sequence. ARS will generate the DNA libraries and provide next-generation DNA sequencing services, at cost, to generate the full 50X genome coverage of sequence (40X for the reference animal plus 1X for each of the 10 additional animals). ARS will analyze, store, and distribute sequence information (as agreed upon with the COOPERATOR) that will be used for both genome assembly and SNP discovery. Identification of SNP for marker development will be done by ARS. ARS and the COOPERATOR will jointly aid development of the genome assembly, which will be led by outside parties that possess the necessary advanced expertise for this process. The COOPERATOR will provide some annotation expertise to help prepare the data for publication.
All of these activities will be considered true collaborations and thus, by definition, each party will be considered to have provided a true intellectual contribution consistent with authorship.
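The heterozygosity screen described in the approach can be illustrated with a minimal sketch. The report does not say how the indices were computed or which direction was preferred, so everything here is an assumption: genotypes are coded as allele counts (0, 1, 2), -1 marks a failed call, the animal names and genotype values are made up, and the index is simply the fraction of called SNP that are heterozygous.

```python
from typing import Dict, List, Tuple

MISSING = -1  # hypothetical code for a failed genotype call


def observed_heterozygosity(genotypes: List[int]) -> float:
    """Fraction of successfully called SNP that are heterozygous.

    Genotypes are coded as the count of one allele: 0 or 2 for the two
    homozygous classes, 1 for heterozygous, MISSING for no call.
    """
    called = [g for g in genotypes if g != MISSING]
    if not called:
        raise ValueError("no called genotypes")
    return sum(1 for g in called if g == 1) / len(called)


def rank_candidates(panel: Dict[str, List[int]]) -> List[Tuple[str, float]]:
    """Sort animals from lowest to highest observed heterozygosity."""
    scores = {name: observed_heterozygosity(g) for name, g in panel.items()}
    return sorted(scores.items(), key=lambda item: item[1])


# Made-up genotypes for three hypothetical candidate animals
panel = {
    "bull_A": [0, 1, 2, 2, 0, 1, -1, 0],
    "bull_B": [0, 0, 2, 2, 0, 1, 2, 0],
    "bull_C": [1, 1, 2, 1, 0, 1, 1, -1],
}
ranking = rank_candidates(panel)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Failed calls are excluded from the denominator so that animals with lower call rates are not scored as artificially homozygous.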
All ARS objectives were completed, and a draft genome assembly (Nelore 1.0), produced from 90X genome coverage by non-funded partners from the University of Maryland, is available for use upon request. In this draft assembly, more than 90% of the genome is in sequence contigs of 2 kilobases (kb) or larger and scaffolds of 48 kb or larger. Final genome assembly and annotation are still in progress; their completion relies on additional long-insert library sequencing from non-funded partners at the University of Sao Paulo-Piracicaba. It should be noted that some sequence production and experimental planning were contributed by USMARC (Clay Center, NE), and production of cDNA libraries to improve annotation was completed at USDA, LARRL (Miles City, MT). ARS scientists also initiated improvement of the assembly through collaboration with the Advanced Technology Center at SAIC (NIH/NCI). This effort produced 15 billion base pairs of third-generation sequence using a PacBio sequencer. This research supported two objectives of its related in-house project:
1) to develop biological resources and computational tools to enhance characterization of the bovine genome sequence (obj. #1), and
2) to use genotypic data to enhance genetic improvement through development and implementation of whole-genome selection and enhanced parentage verification approaches (obj. #2).
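The contig and scaffold figures quoted for the draft assembly (more than 90% of the genome in contigs of 2 kb or larger) read like an N90-type contiguity statistic. The sketch below shows how such a value could be computed from a list of sequence lengths. It assumes that interpretation; the function name and the example lengths are hypothetical and are not taken from the Nelore 1.0 assembly.

```python
from typing import List


def length_at_fraction(lengths: List[int], fraction: float) -> int:
    """Largest length L such that sequences of length >= L together hold
    at least `fraction` of all assembled bases (N50 for 0.5, N90 for 0.9)."""
    if not lengths:
        raise ValueError("no sequence lengths given")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    target = fraction * sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating bases until the
    # requested share of the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= target:
            return length
    return min(lengths)  # guard against floating-point rounding


# Made-up contig lengths in base pairs (20,000 bp in total)
contigs = [8000, 6000, 3000, 2000, 700, 300]
n90 = length_at_fraction(contigs, 0.9)
n50 = length_at_fraction(contigs, 0.5)
print(f"N90 = {n90} bp, N50 = {n50} bp")
```

With these illustrative lengths, the four longest contigs hold 19,000 of 20,000 bases, so 90% of the toy assembly lies in contigs of 2,000 bp or larger.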