Location: Plant, Soil and Nutrition Research
Title: Marker genotype imputation in a low-marker-density panel with a high-marker-density reference panel: accuracy evaluation in barley breeding lines
Submitted to: Crop Science
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 9/28/2009
Publication Date: 4/20/2010
Citation: Iwata, H., Jannink, J. 2010. Marker genotype imputation in a low-marker-density panel with a high-marker-density reference panel: accuracy evaluation in barley breeding lines. Crop Science. 50:1269-1278.
Interpretive Summary: Recent advances in genotyping technology have made data acquisition of genome-wide marker polymorphisms cost-effective. Although the development of cheap, high-throughput genotyping assays has made large-scale genotyping experiments feasible, it is not necessarily efficient to genotype all markers for all samples. Many polymorphisms in a genomic region may be highly correlated with each other, so it may be possible to impute the genotype of some markers from the genotype of other markers. The Barley Coordinated Agricultural Project (CAP; www.BarleyCAP.org) was formed to develop genomic and statistical tools for whole-genome association studies and to integrate them into US barley breeding programs. The project maintains a Barley Core set consisting of 102 lines that have been typed at over 3,800 SNPs and 1,400 DArT markers. Barley breeding lines, however, will be typed at a subset of 1,500 or 3,000 markers. The research reported in this paper evaluated our ability to impute marker scores onto barley breeding lines using Barley Core data. We found that 92% of markers were imputed correctly more than 90% of the time. This research provides evidence that a strategy of genotyping breeding lines at a comparatively low density and low cost, then imputing marker scores based on a high-density panel, should improve our chances of detecting genetic factors affecting the phenotype.
Technical Abstract: We evaluated a strategy for whole-genome genotyping of barley breeding lines in which the scores of markers untyped in a low-density experimental panel are imputed from data on a high-density reference panel. Using a barley core set consisting of 98 lines genotyped with 3205 markers (high-density reference panel), we imputed marker scores untyped in 863 barley breeding lines genotyped with 1330 common markers (low-density experimental panel). In repeated analyses, the scores of one common marker were masked in the experimental panel and then imputed as if the marker were untyped. Imputation accuracy was evaluated by comparing imputed scores with the true ones. The correct imputation rate was more than 0.9 in 92% of markers. The squared correlation coefficient between true and imputed scores was more than 0.6 in 86% of the markers. Factors affecting imputation accuracy were minor allele frequency, linkage disequilibrium with neighboring common markers, minimum distance to the closest common marker, and degree of differentiation among populations. Actual quantitative trait loci (QTL) would be unobserved in both reference and experimental panels. Markers masked in both panels to mimic this situation sometimes showed a larger correlation with imputed markers than with typed common markers, indicating that imputation can sometimes capture the variation of unknown QTL better than the genotypes of common markers.
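The masking procedure and the two accuracy measures described in the abstract can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the abstract does not name the imputation algorithm, so `impute_score` is a simple stand-in that copies the allele from the best-matching reference lines at flanking markers. Scores are coded 0/1, and the function names, the `window` parameter, and the toy panels are hypothetical.

```python
def correct_rate(true_scores, imputed_scores):
    """Fraction of lines whose imputed score equals the true score."""
    matches = sum(t == i for t, i in zip(true_scores, imputed_scores))
    return matches / len(true_scores)


def r_squared(true_scores, imputed_scores):
    """Squared Pearson correlation between true and imputed scores."""
    n = len(true_scores)
    mean_t = sum(true_scores) / n
    mean_i = sum(imputed_scores) / n
    cov = sum((t - mean_t) * (i - mean_i)
              for t, i in zip(true_scores, imputed_scores))
    var_t = sum((t - mean_t) ** 2 for t in true_scores)
    var_i = sum((i - mean_i) ** 2 for i in imputed_scores)
    if var_t == 0 or var_i == 0:
        return float("nan")  # correlation is undefined without variation
    return cov * cov / (var_t * var_i)


def impute_score(reference, line, target, window=2):
    """Stand-in imputer (an assumption, not the method used in the paper).

    The score at `target` is treated as masked: it is never read from `line`.
    Reference lines are ranked by mismatches at up to `window` markers on
    each side of the target, and the best matches vote on the allele.
    """
    lo = max(0, target - window)
    hi = min(len(line), target + window + 1)
    flanks = [j for j in range(lo, hi) if j != target]

    def mismatches(ref_line):
        return sum(ref_line[j] != line[j] for j in flanks)

    best = min(mismatches(ref_line) for ref_line in reference)
    votes = [ref_line[target] for ref_line in reference
             if mismatches(ref_line) == best]
    return 1 if 2 * sum(votes) > len(votes) else 0  # ties go to allele 0


def masked_evaluation(reference, experimental, window=2):
    """Mask each common marker in turn, impute it, and score the result."""
    results = []
    for target in range(len(experimental[0])):
        true_scores = [line[target] for line in experimental]
        imputed = [impute_score(reference, line, target, window)
                   for line in experimental]
        results.append((correct_rate(true_scores, imputed),
                        r_squared(true_scores, imputed)))
    return results


def share_above(values, threshold):
    """Proportion of markers whose accuracy measure exceeds `threshold`."""
    return sum(v > threshold for v in values) / len(values)


# Toy panels (made-up data): rows are lines, columns are ordered markers.
reference_panel = [[0, 0, 0, 0], [1, 1, 1, 1], [0, 0, 1, 1]]
experimental_panel = [[0, 0, 0, 0], [1, 1, 1, 1], [0, 0, 1, 1], [1, 1, 1, 1]]

results = masked_evaluation(reference_panel, experimental_panel)
rates = [rate for rate, _ in results]
print("share of markers with correct rate > 0.9:", share_above(rates, 0.9))
```

A summary such as `share_above(rates, 0.9)` corresponds to the kind of figure reported in the abstract (the proportion of markers whose correct imputation rate exceeds 0.9). The toy panels only exercise the code and say nothing about real barley data.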