
Title: Selection of sequence variants to improve dairy cattle genomic predictions

Authors:
Tooker, Melvin
VanRaden, Paul
Bickhart, Derek
O'Connell, Jeffrey - University of Maryland

Submitted to: Journal of Dairy Science
Publication Type: Abstract Only
Publication Acceptance Date: 4/21/2016
Publication Date: 7/9/2016
Citation: Tooker, M.E., Van Raden, P.M., Bickhart, D.M., O'Connell, J.R. 2016. Selection of sequence variants to improve dairy cattle genomic predictions. Journal of Dairy Science. 99(E-Suppl. 1)/Journal of Animal Science. 94(E-Suppl. 5):138(abstr. 298).

Interpretive Summary:

Technical Abstract: Genomic prediction reliabilities improved when selected sequence variants from run 5 of the 1000 Bull Genomes Project were added to existing marker sets. High-density (HD) imputed genotypes for 26,970 progeny-tested Holstein bulls were combined with sequence variants for 444 Holstein animals. The first test included 481,904 candidate sequence SNPs, consisting of 107,471 exonic variants, 9,422 splice variants, 35,242 variants in untranslated regions at the beginning and end of genes, 254,907 upstream variants, and 74,862 downstream variants, for a total of 762,588 variants after merging with HD genotypes that included 312,614 SNPs. The second test also included 249,966 insertions and deletions (indels) and applied stricter edits, giving a total of 1,003,453 variants. Edits removed variants with minor allele frequency (MAF) < 0.01, low call rates, incorrect map locations, excess heterozygotes, or low correlations between sequence and HD genotypes for the same variant, but kept candidate variants within or near genes. Edits also removed Mendelian conflicts between parents and progeny. Quality of imputation was assessed by keeping 404 of the sequenced animals in the reference population and randomly choosing the other 40 animals as a test set. The sequence genotypes of the test animals were reduced to the subset in common with HD and then imputed back to sequence. The percentage of correctly imputed variants averaged 97.3% across all chromosomes in test 1 and 97.2% in test 2. Total time required to prepare, edit, and impute the sequence variants for 27,235 animals was about 5 days using fewer than 20 processors. Genomic predictions were computed using deregressed evaluations from August 2011 for 33 traits and 19,575 bulls, requiring about 3 days with 33 processors. Predictions were tested using later data from 3,983 bulls whose daughters were first phenotyped after August 2011.
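The MAF edit and the imputation check described above can be illustrated with a minimal Python sketch. It is not the authors' software. It assumes genotypes are coded as allele counts (0, 1, 2) with None for missing calls, and it assumes that "correctly imputed" means an exact genotype match at variants that were masked (those not on the HD panel); the abstract does not state the exact definition. All function and parameter names are hypothetical.

```python
def minor_allele_frequency(genotypes):
    """Return the MAF for one variant from 0/1/2 allele counts (None = missing)."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    freq = sum(called) / (2 * len(called))  # frequency of the counted allele
    return min(freq, 1.0 - freq)


def passes_maf_edit(genotypes, threshold=0.01):
    """Keep a variant only if its MAF is at least the threshold (0.01 in the abstract)."""
    return minor_allele_frequency(genotypes) >= threshold


def percent_correctly_imputed(true_rows, imputed_rows, hd_columns):
    """Percentage of masked genotypes that imputation recovered exactly.

    true_rows / imputed_rows: one list of genotypes per test animal.
    hd_columns: set of variant column indexes that were kept (not masked),
    so they are excluded from the comparison. This definition is an assumption.
    """
    total = 0
    correct = 0
    for true_row, imputed_row in zip(true_rows, imputed_rows):
        for col, (truth, guess) in enumerate(zip(true_row, imputed_row)):
            if col in hd_columns or truth is None:
                continue
            total += 1
            if truth == guess:
                correct += 1
    return 100.0 * correct / total if total else float("nan")
```

In practice the comparison would run per chromosome over the 40 test animals and the results would be averaged; the sketch shows only the counting step.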
Many sequence variants had larger estimated effects than the nearby HD markers, but the reliability of predictions in test 1 improved by only 0.6 percentage points when sequence SNPs were added to HD, and in test 2 it was only 0.4 percentage points higher than HD when both SNPs and indels were included. However, selecting the 17,000 candidate SNPs with the largest estimated effects and adding those to the 60,000 SNPs used routinely improved reliabilities by 2.7 percentage points (67.4% vs. 64.7%) on average across traits. These values compare with a parent average reliability of 35.2%. Accuracy of prediction can improve when selected sequence SNPs are added to marker sets.
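The selection step can be sketched in Python as follows. This is an illustrative sketch only. It assumes candidates are ranked by the absolute value of a single estimated effect per SNP; the abstract does not say how effects were combined across the 33 traits. The function names and the dictionary layout are hypothetical.

```python
import heapq


def select_top_candidates(effects, k):
    """Return the k SNP ids with the largest absolute estimated effect.

    effects: dict mapping SNP id -> estimated effect (assumed layout).
    """
    return heapq.nlargest(k, effects, key=lambda snp: abs(effects[snp]))


def merged_marker_set(routine_snps, candidate_effects, k):
    """Append the top-k candidates to the routine marker list, skipping duplicates."""
    already_present = set(routine_snps)
    selected = select_top_candidates(candidate_effects, k)
    return list(routine_snps) + [s for s in selected if s not in already_present]


def reliability_gain(with_selected, baseline):
    """Gain in percentage points, e.g. 67.4 vs. 64.7 in the abstract."""
    return round(with_selected - baseline, 1)
```

With the figures reported above, k would be 17,000 and the routine list would hold about 60,000 SNPs; `reliability_gain(67.4, 64.7)` reproduces the reported 2.7 percentage points.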