Submitted to: Journal of Dairy Science
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: January 22, 2011
Publication Date: May 1, 2011
Citation: Olson, K.M., Van Raden, P.M., Tooker, M.E., Cooper, T.A. 2011. Differences among methods to validate genomic evaluations for dairy cattle. Journal of Dairy Science. 94(5):2613-2620.
Interpretive Summary: Two methods of validation were used to test genomic predictions for milk, health, and fertility traits in genotyped dairy cattle. One method estimated single nucleotide polymorphism effects from the training animals' data as of four years earlier. The other method estimated those effects from the training animals' current data. Both methods were tested by regression on the current data of the validation animals. The results varied by method, and using current data often overestimated the coefficient of determination.
Technical Abstract: Two methods of testing predictions from genomic evaluations were investigated. Data were from the August 2006 and April 2010 official USDA genetic evaluations of dairy cattle. The training data set consisted of cows and bulls that were proven (had own or daughter information) as of August 2006: 8,022 Holstein, 1,959 Jersey, and 1,056 Brown Swiss. The validation data set consisted of bulls that were unproven as of August 2006 and proven by April 2010: 2,653 Holstein, 411 Jersey, and 132 Brown Swiss for the production traits. Method 1 used the training animals’ estimated breeding values (EBV) from August 2006 to estimate single nucleotide polymorphism (SNP) effects. Method 2 used the training animals’ April 2010 EBV to estimate SNP effects. Both methods were then tested using multiple regressions with the same validation animals. In both cases, the validation animals were tested using their deregressed April 2010 EBV. All traits that had genomic evaluations in the official USDA April 2010 genetic evaluations were tested. Results included bias, differences from expected regressions (calculated using selection intensities), and the coefficient of determination. The genomic information increased predictive ability for most traits in all breeds. The two methods of testing produced some differences that would affect interpretation of results. The coefficient of determination was higher for all traits using method 2. This was expected because the training and validation data were not independent: evaluations of the validation bulls contributed to their sires’ evaluations. The regression coefficients from method 2 were often higher than those from method 1. With method 2, many traits had regression coefficients that were more than two standard deviations from the expected regressions. This was partially due to the lack of independence between the training and validation data sets.
Most traits showed some level of bias in the prediction equations, regardless of breed. Method 1 also made it possible to evaluate how much genomic information increased the accuracy of evaluations for proven first-crop bulls, and these bulls did gain accuracy when genomic information was added to the proof. Method 1 is advised for validation of genomic evaluations.
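The validation statistics named in the abstract (regression coefficient, coefficient of determination, and bias) can be illustrated with a minimal sketch. This is not the authors' code. It assumes a single-predictor regression of the validation bulls' deregressed EBV on their genomic predictions, whereas the study used multiple regressions, and it assumes bias is measured as mean prediction minus mean observed value. The function name `validate_predictions` and its arguments are hypothetical.

```python
from statistics import mean


def validate_predictions(predicted, observed):
    """Regress observed values (e.g. deregressed EBV) on predicted values.

    Simplifying assumptions (not taken from the paper): one predictor only,
    unweighted least squares, bias = mean(predicted) - mean(observed).
    """
    if len(predicted) != len(observed) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    p_bar, o_bar = mean(predicted), mean(observed)

    # Sums of squares and cross-products around the means.
    sxx = sum((p - p_bar) ** 2 for p in predicted)
    syy = sum((o - o_bar) ** 2 for o in observed)
    sxy = sum((p - p_bar) * (o - o_bar) for p, o in zip(predicted, observed))
    if sxx == 0 or syy == 0:
        raise ValueError("predicted and observed values must each vary")

    slope = sxy / sxx                      # regression coefficient
    intercept = o_bar - slope * p_bar
    r_squared = sxy ** 2 / (sxx * syy)     # coefficient of determination
    bias = p_bar - o_bar                   # assumed definition of bias

    return {
        "slope": slope,
        "intercept": intercept,
        "r_squared": r_squared,
        "bias": bias,
    }


if __name__ == "__main__":
    # Made-up numbers for illustration only; not data from the study.
    genomic_prediction = [1.0, 2.0, 3.0, 4.0]
    deregressed_ebv = [2.0, 4.0, 6.0, 8.0]
    print(validate_predictions(genomic_prediction, deregressed_ebv))
```

Under this sketch, comparing the two methods would mean calling the function twice on the same validation bulls: once with predictions built from August 2006 training EBV (method 1) and once with predictions built from April 2010 training EBV (method 2). The slope can then be compared with the expected regression and the two `r_squared` values with each other.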