
Title: Marker imputation in barley association studies

Author
Jannink, Jean-Luc
IWATA, HIROYOSHI - NARC, JAPAN
BHAT, PRASANNA - UNIV. OF CA, RIVERSIDE
Chao, Shiaoman
WENZL, PETER - TRITICARTE, P/L
MUEHLBAUER, GARY - UNIV. OF MN, ST. PAUL

Submitted to: The Plant Genome
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 2/13/2009
Publication Date: 3/1/2009
Citation: Jannink, J., Iwata, H., Bhat, P.R., Chao, S., Wenzl, P., Muehlbauer, G.J. 2009. Marker imputation in barley association studies. The Plant Genome. 2:11-22.

Interpretive Summary: Association mapping requires higher marker density than linkage mapping, potentially leading to more missing marker data and to higher genotyping costs. In human genetics, statistical methods have been developed to alleviate these problems by inferring (imputing) missing marker scores and the scores of markers that were typed in a permanent reference panel but not in ongoing experimental data. This manuscript presents evidence that these methods designed for human data can also be effective for barley association mapping studies. Our reference panel contained 98 lines, 2517 single nucleotide polymorphism (SNP) markers, and 716 Diversity Arrays Technology (DArT) markers. Averaged over markers, masked scores were correctly imputed 96.9% of the time. About 20% of the markers were chosen as tag markers for a reference panel. Despite this low number of tags, non-tag markers were accurately imputed in simulated experimental datasets. When the DArT markers were used as tags, the SNP markers were also accurately imputed, suggesting that the imputation method can be used to convert association information from one marker system (e.g., DArT) to that of another marker system (e.g., SNP). We believe marker imputation methods will have an important future in association studies in reducing genotyping burdens, as a component of marker tagging methods, and in reducing analysis problems due to missing marker data.

Technical Abstract: Association mapping requires higher marker density than linkage mapping, potentially leading to more missing marker data and to higher genotyping costs. In human genetics, methods exist to impute missing marker data and whole markers that were typed in a reference panel but not in the experimental dataset. Our objectives were to determine if an imputation method developed for human data (fastPHASE) could effectively impute missing data and completely missing markers in a barley association mapping reference panel. The reference panel contained 98 lines, 2517 single nucleotide polymorphism (SNP) markers, and 716 Diversity Arrays Technology (DArT) markers. Averaged over markers, fastPHASE imputed masked scores correctly 96.9% of the time. Out of all markers, 610 and 273, respectively, were chosen as tag markers in two- and six-row barley subpopulations. Despite this low number of tags, fastPHASE imputation accuracy was such that for about 80% of non-tag markers, the prediction r² between imputed and true marker scores was 0.8 or higher. When the DArT markers were used as tags, the SNP markers were imputed with similarly high prediction r², suggesting that the imputation method can be used to convert association information from one marker system (e.g., DArT) to that of another marker system (e.g., SNP). We believe marker imputation methods will have an important future in association studies in reducing genotyping burdens, as a component of marker tagging methods, and in reducing analysis problems due to missing marker data.
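
The two accuracy measures named in the abstract (the proportion of masked scores imputed correctly, and the prediction r² between imputed and true marker scores) can be illustrated with a minimal Python sketch. It does not run fastPHASE and does not reproduce the study's pipeline. The function names and the example scores are hypothetical, and it assumes biallelic markers coded 0/1 and that prediction r² means the squared Pearson correlation, which the abstract does not spell out.

```python
def imputation_accuracy(true_scores, imputed_scores):
    """Fraction of masked scores whose imputed value equals the true value."""
    if len(true_scores) != len(imputed_scores) or not true_scores:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(t == i for t, i in zip(true_scores, imputed_scores))
    return matches / len(true_scores)


def prediction_r2(true_scores, imputed_scores):
    """Squared Pearson correlation between true and imputed scores.

    Assumption: this is what 'prediction r2' denotes; the abstract
    does not give the exact formula.
    """
    n = len(true_scores)
    if n != len(imputed_scores) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_t = sum(true_scores) / n
    mean_i = sum(imputed_scores) / n
    cov = sum((t - mean_t) * (i - mean_i)
              for t, i in zip(true_scores, imputed_scores))
    var_t = sum((t - mean_t) ** 2 for t in true_scores)
    var_i = sum((i - mean_i) ** 2 for i in imputed_scores)
    if var_t == 0 or var_i == 0:
        # Correlation is undefined for a constant vector; report 0.0 here.
        return 0.0
    return cov * cov / (var_t * var_i)


# Hypothetical scores for one marker across eight lines (0/1 allele coding).
true_scores = [0, 1, 1, 0, 1, 0, 0, 1]
imputed_scores = [0, 1, 1, 0, 1, 0, 1, 1]

accuracy = imputation_accuracy(true_scores, imputed_scores)
r2 = prediction_r2(true_scores, imputed_scores)
print(f"accuracy = {accuracy:.3f}, prediction r2 = {r2:.3f}")
```

In this made-up example one wrong call in eight gives an accuracy of 0.875 but a prediction r² of only 0.6, which shows why the two measures are reported separately: r² penalizes errors more heavily when few lines are scored or allele frequencies are unbalanced.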