Title: Effects of reduced panels, reference sizes, reference origins, and genetic relationship on imputation of genotypes in Hereford cattle
Authors: Huang, Y; Maltecca, C; Cassady, J
Submitted to: Journal of Animal Science
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: June 19, 2012
Publication Date: N/A
Interpretive Summary: Using genomics in research and genetic selection relies on single nucleotide polymorphism (SNP) marker panels. These SNP panels differ in density, ranging from a few hundred markers (cheap) to more than 600,000 markers (costly). However, not all animals will be genotyped with the same panel. To make full use of data collected from different panels, SNP genotypes that were not assayed in the low-density samples will be imputed. This study assessed factors affecting the accuracy of that imputation. Data were from Line 1 and industry Hereford cattle. Reduced panels with high minor allele frequency and evenly spaced SNP had lower imputation error rates. Increased relationship between animals genotyped with the high- and low-density panels improved accuracy of imputation. Thus, imputation error rate may be reduced by optimal design of the reduced panel and choice of animals to be genotyped with each panel. These results are important because they can lead to more efficient use of funds in research and in evaluation of breeding stock.
Technical Abstract: The objective of this study was to investigate alternative methods for designing and utilizing reduced single nucleotide polymorphism (SNP) panels for imputing SNP genotypes. Two purebred Hereford populations were utilized: an experimental population known as Line 1 Hereford (L1, N = 240) and Hereford cattle registered with the American Hereford Association (AHA, N = 311). Using different reference samples of 62 to 311 animals with 39,497 SNPs on 29 autosomes, and study samples of 57 or 62 animals for which genotypes were available for ~2,600 SNPs (reduced panels), imputations were performed to predict the other ~36,900 loci, which had been masked. An imputation package including LinkPHASE and DAGPHASE (Druet and Georges, 2009) was used for imputation. Four reduced panels differing in minor allele frequency (MAF) and marker spacing were evaluated. Reference individuals were from either L1 or AHA. Among animals with genotypes, genetic relationships were estimated based on molecular marker genotypes or pedigree. Reduced panel design, number of animals in the reference sample, reference origin, and the genetic relationships between animals in the reference and study samples all affected imputation error (P < 0.001). Across genotyping schemes, the reduced panel with high MAF (> 0.35) and evenly spaced SNPs had the lowest imputation error rate (P < 0.001). Molecular and pedigree relationships were used to represent genetic relationship. A 0.1 increase in average molecular relationship or average pedigree relationship within the study samples resulted in an 11.33 or 15.00% decrease in imputation error rate, respectively. Reference samples from the L1 population had a 1.50% higher imputation error rate than reference samples from the admixed population (P < 0.001). When pedigree relationship or molecular relationship was used as a covariate, increasing the number of animals in the reference panel decreased imputation error rate by 2.9 or 2.1%, respectively.
Based on these results, it was concluded that imputation error rate may be reduced through optimization of reduced panel design and genotyping strategy.
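The masking design described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it does not call LinkPHASE or DAGPHASE, and the function names, the 0/1/2 genotype coding, and the definition of error rate as the proportion of masked genotype calls that differ from the true calls are all assumptions made for illustration. The panel selection simply takes the SNP with MAF above a threshold that lies nearest to each of a set of evenly spaced target positions, which is one plausible reading of "high MAF and evenly spaced".

```python
import numpy as np


def minor_allele_freq(genotypes):
    """Return per-SNP minor allele frequency.

    genotypes: (animals, snps) array of allele counts coded 0/1/2
    (assumed coding, not taken from the source).
    """
    p = np.asarray(genotypes, dtype=float).mean(axis=0) / 2.0
    return np.minimum(p, 1.0 - p)


def select_reduced_panel(positions, maf, n_select, maf_min=0.35):
    """Pick up to n_select roughly evenly spaced SNPs with MAF > maf_min.

    Hypothetical helper: for each of n_select evenly spaced target
    positions, keep the nearest eligible SNP. Returns sorted SNP indices.
    """
    positions = np.asarray(positions, dtype=float)
    eligible = np.flatnonzero(np.asarray(maf) > maf_min)
    if eligible.size == 0:
        return eligible
    elig_pos = positions[eligible]
    targets = np.linspace(elig_pos.min(), elig_pos.max(), n_select)
    # Distance from every target to every eligible SNP; take the closest.
    nearest = np.abs(elig_pos[None, :] - targets[:, None]).argmin(axis=1)
    return np.unique(eligible[nearest])


def imputation_error_rate(true_gt, imputed_gt, masked):
    """Proportion of masked genotype calls that were imputed incorrectly
    (assumed definition of error rate)."""
    true_gt = np.asarray(true_gt)
    imputed_gt = np.asarray(imputed_gt)
    return float(np.mean(true_gt[:, masked] != imputed_gt[:, masked]))


if __name__ == "__main__":
    # Toy data: 2 study animals, 4 SNPs; SNPs 1 and 2 were masked.
    true_gt = np.array([[0, 1, 2, 1], [2, 1, 0, 0]])
    imputed_gt = np.array([[0, 1, 1, 1], [2, 0, 0, 0]])
    print(imputation_error_rate(true_gt, imputed_gt, np.array([1, 2])))
```

In a real analysis, `imputed_gt` would come from the imputation software run on the reduced panel, and the error rate would be computed only over the loci that were masked in the study sample.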