Submitted to: Genes, Genomes, and Genomics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 9/10/2012
Publication Date: 11/1/2012
Publication URL: https://doi.org/10.1534/g3.112.004259
Citation: Endelman, J.B., Jannink, J. 2012. Shrinkage estimation of the realized relationship matrix. Genes, Genomes, and Genomics. 2(11):1405-1413.
Interpretive Summary: Crop improvement consists of repeated cycles of selecting superior lines and intermating them to produce new offspring. The breeding value of an individual represents the expected performance of its offspring and is a key parameter used by breeders when selecting parents for mating. With genomic selection, DNA sequence data is combined with historical trait data to predict breeding values and accelerate the rate of crop improvement. A classic but still highly effective prediction method is based on estimating genetic relationships from the sequence data. In this paper we present a comprehensive theoretical foundation for prediction based on the relationship matrix. For cost effectiveness, it is advantageous for breeders to use as few genetic markers, or sequence data points, as possible without compromising the accuracy of the predictions. We demonstrate that a statistical technique known as shrinkage estimation can improve prediction accuracy with low-density markers when moderate-accuracy phenotypes are available. Our results suggest shrinkage estimation has the potential to improve the analysis of single-replicate and multi-environment yield trials in plant breeding.
Technical Abstract: The additive relationship matrix plays an important role in mixed model prediction of breeding values. For genotype matrix X (loci in columns), the product XX' is widely used as a realized relationship matrix, but the scaling of this matrix is ambiguous. Our first objective was to derive a proper scaling such that the mean diagonal element equals 1+f, where f is the inbreeding coefficient of the current population. The result is a formula involving the covariance matrix for sampling genomic loci, which must be estimated with markers. Our second objective was to investigate whether shrinkage estimation of this covariance matrix can improve the accuracy of genomic estimated breeding value (GEBV) predictions with low-density markers. Using an analytical formula for shrinkage intensity that is optimal with respect to mean-squared error, simulations revealed that shrinkage can significantly increase GEBV accuracy in unstructured populations, but only for phenotyped lines; there was no benefit for unphenotyped lines. The accuracy gain from shrinkage increased with heritability, but at high heritability (> 0.6) this benefit was irrelevant because phenotypic accuracy was comparable. These trends were confirmed in a commercial pig population with progeny-test-estimated breeding values. For an anonymous trait where phenotypic accuracy was 0.58, shrinkage increased the average GEBV accuracy from 0.56 to 0.62 (SE < 0.00) when using random sets of 384 markers from a 60K array. We conclude that when moderate-accuracy phenotypes and low-density markers are available for the candidates of genomic selection, shrinkage estimation of the relationship matrix can improve genetic gain.
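Illustrative sketch (not the authors' code): the abstract describes two steps, scaling XX' so that the mean diagonal equals 1+f, and shrinking the estimate. The Python below is a minimal sketch of the general idea under stated assumptions. Genotypes are coded as allele counts 0/1/2, allele frequencies are estimated from the sample itself, and the scaling divides the centered cross-product by 2 * sum(p(1-p)), a common convention that may differ in detail from the formula derived in the paper. The shrinkage target (mean diagonal times the identity) and the user-supplied intensity `delta` are also assumptions; the paper uses an analytical, mean-squared-error-optimal intensity that is not reproduced here. The function names are hypothetical.

```python
def realized_relationship(X):
    """Scaled realized relationship matrix from allele counts.

    X is a list of rows (one per line or individual); each entry is the
    number of copies (0, 1 or 2) of the counted allele at one marker.
    Assumption: allele frequencies are estimated from X itself.
    """
    n = len(X)
    m = len(X[0])
    # Sample frequency of the counted allele at each marker.
    p = [sum(row[k] for row in X) / (2.0 * n) for k in range(m)]
    # Scaling constant 2 * sum of p(1 - p) across markers.
    scale = 2.0 * sum(pk * (1.0 - pk) for pk in p)
    if scale == 0.0:
        raise ValueError("all markers are monomorphic")
    # Center each marker column by its expected allele count 2p.
    W = [[row[k] - 2.0 * p[k] for k in range(m)] for row in X]
    return [
        [sum(W[i][k] * W[j][k] for k in range(m)) / scale for j in range(n)]
        for i in range(n)
    ]


def shrink_toward_identity(A, delta):
    """Linear shrinkage of A toward (mean diagonal) * identity.

    delta is the shrinkage intensity in [0, 1]; here it is supplied by
    the caller (an assumption), not computed analytically.
    """
    if not 0.0 <= delta <= 1.0:
        raise ValueError("delta must lie in [0, 1]")
    n = len(A)
    mean_diag = sum(A[i][i] for i in range(n)) / n
    return [
        [
            (1.0 - delta) * A[i][j] + (delta * mean_diag if i == j else 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]


if __name__ == "__main__":
    # Four fully inbred lines scored at four markers (made-up data).
    X_demo = [[2, 0, 2, 0], [0, 2, 2, 0], [2, 2, 0, 0], [0, 0, 0, 2]]
    A_demo = realized_relationship(X_demo)
    S_demo = shrink_toward_identity(A_demo, 0.25)
    for row in S_demo:
        print([round(v, 3) for v in row])
```

With fully inbred lines (f = 1) and sample-estimated frequencies, this scaling gives a mean diagonal of about 2, matching the 1+f target in the abstract. Shrinking toward a scaled identity leaves the mean diagonal unchanged and multiplies every off-diagonal element by (1 - delta).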