1a.Objectives (from AD-416):
The objective of this cooperative research project is to design tools to process and annotate livestock genome sequences, including gene expression data and single nucleotide polymorphisms (SNPs) for breed-specific haplotype map development, and to support the ongoing genomic selection, genome-wide association analysis, and QTL fine-mapping projects.
1b.Approach (from AD-416):
Software will be developed or adapted to annotate whole-genome sequence data being generated through international collaborations that involve ARS. Annotation tasks include gene prediction, use of comparative mapping information, and identification of novel genetic markers (polymorphisms) and positional candidate genes. Emphasis should be given to problems specific to post-analysis of the bovine genome sequencing project, as BFGL is leading those efforts and development of tools with wide utility for the community is a priority. Tools will be developed or adopted to quickly identify positional candidate genes by integrating QTL map positions with signatures of selection, gene expression, and breed-specific polymorphism information. Methods for comparing genetic variation in coding genes between sub-species of cattle need to be investigated.
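As an illustration of the simplest form of positional candidate gene identification, the sketch below lists genes whose coordinates overlap a QTL interval. It is not the project's software. The Gene record, the positional_candidates function, and all gene names and coordinates are hypothetical, and real tools would also weigh selection signatures, expression, and polymorphism data.

```python
from typing import List, NamedTuple


class Gene(NamedTuple):
    """Hypothetical gene record: name, chromosome, and closed interval in bp."""
    name: str
    chrom: str
    start: int
    end: int


def positional_candidates(genes: List[Gene], chrom: str,
                          qtl_start: int, qtl_end: int) -> List[str]:
    """Return names of genes overlapping [qtl_start, qtl_end] on chrom."""
    return [
        g.name
        for g in genes
        # Two closed intervals overlap when each starts before the other ends.
        if g.chrom == chrom and g.start <= qtl_end and g.end >= qtl_start
    ]


# Made-up coordinates, for illustration only.
genes = [
    Gene("GENE_A", "6", 1_000, 5_000),
    Gene("GENE_B", "6", 8_000, 12_000),
    Gene("GENE_C", "14", 2_000, 3_000),
]

candidates = positional_candidates(genes, "6", 4_500, 9_000)
print(candidates)
```

A linear scan like this is adequate for a single QTL region; for many regions against a full gene set, an interval index would be a more reasonable choice.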
The cooperators received data on approximately 48 million SNPs discovered by sequencing 60 animals on a next-generation sequencing platform and ran these SNPs through a series of bioinformatic tools to annotate them. The 48,620,857 SNPs were functionally annotated using ANNOVAR, a genetic variant annotation program that is fast enough to annotate millions of SNPs on a desktop computer. More than 9,000 stop-gain and stop-loss mutations were identified. This unexpectedly high number indicated that the dataset contained false-positive SNPs, so ARS scientists reran the SNP discovery analysis. The cooperators submitted a written report of the initial findings based on the flawed dataset.
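The kind of tally that exposes an excess of stop codon mutations can be sketched as follows. This is a minimal example, not the cooperators' actual pipeline. It assumes tab-delimited annotation output in which the second column holds the functional consequence label, as in ANNOVAR's exonic variant function output, and it assumes the labels begin with "stopgain" or "stoploss"; exact label strings can differ between ANNOVAR versions. The function names and sample lines are hypothetical.

```python
from collections import Counter
from typing import Iterable


def tally_consequences(lines: Iterable[str]) -> Counter:
    """Count variants per functional consequence label.

    Assumes tab-delimited input with the label in the second column.
    """
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        counts[fields[1]] += 1
    return counts


def stop_codon_total(counts: Counter) -> int:
    """Sum the counts of stop-gain and stop-loss labels."""
    return sum(
        n for label, n in counts.items()
        if label.startswith(("stopgain", "stoploss"))
    )


# Made-up lines in the assumed layout, for illustration only.
sample = [
    "line1\tsynonymous SNV\tGENE_A\t1\t100\t100\tA\tG",
    "line2\tstopgain\tGENE_B\t1\t250\t250\tC\tT",
    "line3\tnonsynonymous SNV\tGENE_B\t1\t300\t300\tG\tA",
    "line4\tstoploss\tGENE_C\t2\t900\t900\tT\tC",
]

counts = tally_consequences(sample)
stop_total = stop_codon_total(counts)
print(counts, stop_total)
```

On a real dataset the same functions could be applied to an open file handle, since the tally reads one line at a time and does not load the full annotation file into memory.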