
Title: Accuracy of genomic selection in barley breeding programs: a simulation study based on the real SNP data

Author
Iwata, Hiroyoshi - University of Tokyo
Jannink, Jean-Luc

Submitted to: Crop Science
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 4/18/2011
Publication Date: N/A
Citation: N/A

Interpretive Summary: Recent advances in genotyping technology have made data acquisition of genome-wide marker polymorphisms cost effective. Statistical methods that can best take advantage of these marker data, however, have yet to be identified. One promising area is “genomic selection”, in which the performance of a breeding line is predicted on the basis of genome-wide markers. The aim of this study was to compare the accuracy of statistical methods for genomic selection through simulations based on real barley marker data. We evaluated the performance of three categories of statistical methods for building prediction models over a range of relevant simulation conditions: Bayesian methods, multivariate regression methods, and machine learning methods. Prediction accuracy was high for Bayesian methods and ridge regression. Under medium and high trait heritability (h2 = 0.4 and 0.6), the mean of predictions from all methods was more accurate than predictions based on any single method, suggesting that different methods predict different aspects of breeding line performance. The advantage of genomic over phenotypic prediction was larger under lower heritability and with a larger training dataset. The difference in prediction accuracy for traits affected by many genes versus relatively few genes was small. The models were also useful in increasing the accuracy of predictions on breeding lines with phenotypic records. The results indicate that further research into the combination of statistical methods is warranted.

Technical Abstract: The aim of this study was to compare the accuracy of genomic selection (i.e., selection based on genome-wide markers) with that of phenotypic selection through simulations based on real barley SNP data (1325 SNPs x 863 breeding lines). We simulated 100 QTL at randomly selected SNPs, which were then dropped from the marker data. The total heritability summed over all QTL was set to 0.1, 0.2, 0.4 or 0.6. We generated 100 datasets for each simulation condition. Each dataset was then split into a training set (N = 200, 400 or 600) and a validation set. Bayesian methods, multivariate regression methods (partial least squares, ridge regression) and machine learning methods (random forest, support vector machine) were used for building prediction models. Prediction accuracy was high for Bayesian methods and ridge regression. Under medium and high heritability (h2 = 0.4 and 0.6), the mean of predictions from all methods was more accurate than predictions based on any single method, suggesting that different methods captured different aspects of genotype-phenotype associations. The advantage of genomic over phenotypic selection was larger under lower heritability and with a larger training dataset. The difference in prediction accuracy between polygenic and oligogenic traits was small. The models were also useful in increasing the accuracy of predictions on breeding lines with phenotypic records. The results indicate that genomic selection can be efficiently used in barley breeding programs.
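
The simulation design described in the technical abstract can be illustrated with a minimal Python sketch. It is not the authors' code and rests on several assumptions that the abstract does not state: the real barley SNP matrix is replaced by synthetic inbred genotypes (coded 0/2) with a simple marker-to-marker carryover to mimic linkage disequilibrium, QTL effects are drawn from a standard normal distribution, the ridge penalty `lam` is set by a common rule of thumb, and accuracy is taken as the correlation between predictions and the simulated true genetic values in the validation set. All function names are hypothetical.

```python
import numpy as np

def simulate_genotypes(n_lines, n_snps, rng, carryover=0.9):
    """Synthetic stand-in for the real SNP data (assumption): inbred
    genotypes coded 0/2, where each marker copies its left neighbour
    with probability `carryover` to create linkage disequilibrium."""
    geno = np.empty((n_lines, n_snps))
    geno[:, 0] = rng.integers(0, 2, size=n_lines) * 2
    for j in range(1, n_snps):
        fresh = rng.integers(0, 2, size=n_lines) * 2
        keep = rng.random(n_lines) < carryover
        geno[:, j] = np.where(keep, geno[:, j - 1], fresh)
    return geno

def simulate_trait(geno, n_qtl, h2, rng):
    """Place QTL at randomly selected SNPs, drop those SNPs from the
    marker matrix, and add noise so that var(g) / var(y) is about h2."""
    qtl_idx = rng.choice(geno.shape[1], size=n_qtl, replace=False)
    effects = rng.normal(size=n_qtl)           # assumed effect distribution
    g = geno[:, qtl_idx] @ effects             # true genetic values
    var_e = g.var() * (1.0 - h2) / h2          # residual variance for target h2
    y = g + rng.normal(scale=np.sqrt(var_e), size=geno.shape[0])
    markers = np.delete(geno, qtl_idx, axis=1) # QTL are not observed
    return markers, y, g

def ridge_predict(x_train, y_train, x_test, lam):
    """Ridge regression in its dual form, which solves an n x n system
    and is convenient when markers outnumber training lines."""
    mu = x_train.mean(axis=0)
    xc = x_train - mu
    y_mean = y_train.mean()
    gram = xc @ xc.T
    alpha = np.linalg.solve(gram + lam * np.eye(len(y_train)), y_train - y_mean)
    return y_mean + (x_test - mu) @ (xc.T @ alpha)

rng = np.random.default_rng(0)
h2, n_train = 0.4, 400                         # one condition from the abstract
geno = simulate_genotypes(863, 1325, rng)
markers, y, g = simulate_trait(geno, n_qtl=100, h2=h2, rng=rng)

# Random split into training and validation lines.
order = rng.permutation(len(y))
train, valid = order[:n_train], order[n_train:]

# Rule-of-thumb penalty (assumption): number of markers * (1 - h2) / h2.
lam = markers.shape[1] * (1.0 - h2) / h2
pred = ridge_predict(markers[train], y[train], markers[valid], lam)

# Accuracy as correlation with the simulated true genetic values.
accuracy = np.corrcoef(pred, g[valid])[0, 1]
print(f"validation accuracy: {accuracy:.2f}")
```

Repeating the run over many seeds, heritabilities and training-set sizes, and swapping `ridge_predict` for other prediction methods, would approximate the comparison the abstract describes, though results on synthetic genotypes should not be expected to match those obtained with the real barley data.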